APPLICATION OF STATISTICAL METHODS TO
ORDNANCE ENGINEERING* 


By Leslie E. Simon†


WHEREAS quality control may be regarded as beginning in 1924, it
is principally the last decade that has seen the rise of the industrial
statistician or quality control engineer. The men in this role
remind one of a conversation between two men, one a close friend and
the other a casual acquaintance of a third man of somewhat unusual
attainments.

“You say Jones is a graduate both of law school and medical college?”
asked the acquaintance. 

“Yes,” said the friend. 

“What is the result of such training?” continued the acquaintance. 

“Well,” said the friend, after a thoughtful pause, “all the lawyers 
call him Doc and all the doctors call him Judge.” 

For my own part, I deem it no discredit that engineers should regard 
me as a statistician and statisticians regard me as an engineer. I am
neither. I feel that I undoubtedly am a professional soldier engaged 
for the most part in the specialized field of ordnance, and that the some- 
what aloof position is not without its advantages. Thus the prides and 
prejudices often associated with one professional field do not cause me 
to hesitate to use the techniques of another if they are of practical 
assistance in attaining the objective of my problem: more and better 
arms for the soldier. 

This anomalous professional situation appears to be general among 
quality control engineers, and the growing demand of industry for 
engineers who have statistical training leads to the belief that the hybrid 
engineer-statistician is going to become one of the greatest industrial 
and technological assets of this decade. However, formal training can- 
not fully prepare one for practical problems. The clearly defined pro- 
cedures of the textbook and classroom may seem to be almost lost in 


* An address given at the Graduate School, the U. S. Department of Agriculture, January 8, 1942,
by authority of the Chief of Ordnance, and at the invitation of my friend Dr. W. Edwards Deming.

† Lt. Colonel, Ordnance Department, U. S. Army, Director of the Ballistic Research Laboratory,
Aberdeen Proving Ground.
the pattern of everyday things. Frequently one must not only seek
solutions through the use of methods which apply only approximately 
at best; but must, through alert observation of current work, recognize 
or create the opportunities for the use of statistical methods. Gold is 
where you find it, the old prospectors used to say. So it is with the 
applications of statistical methods. Opportunities for the application of 
at least some form of statistical methods undoubtedly abound in almost 
all fields, but they may be hidden under a mass of the drab details of 
practical work. I should like to illustrate the hybrid nature of applica- 
tions of statistics and the absence of any formal method in their dis- 
covery with an example with which I have been closely associated. It 
would be pleasant if I could stimulate your interest by suggesting a 
thrilling intellectual field like that presented by Mathematical Statis- 
tics, but both candor and sound policy suggest that instead I give a 
warning of the unappealing simplicity and drab practicality of these 
applications by quoting a few lines from an author whose name I have 
forgotten. 


I walked along the ocean shore
And came upon a jelly fish—
A freshly landed, sadly stranded,
Iridescent, smelly fish.


Simplicity of the control chart technique. As known and used up to the 
present, there is no statistical tool in the hands of the industrial statisti- 
cian that is as universally useful and as practically powerful as the con- 
trol chart. Its practical usefulness is due at least in part to its utter 
simplicity of application. Many a well planned scheme has failed be- 
cause it was too complicated, but I have never heard of one’s failing 
because it was too simple; and it is difficult to realize properly what a 
slight degree of mathematical methodology is promptly branded by 
industry as a mass of theoretical circumlocution. The control chart, 
when used with the average, $\bar{X}$, and the range, R, slays the
hydra-headed dragon of complication; deflates the hobgoblin of
fear-of-the-theoretical; and requires a minimum amount of labor in any large
scale application of statistical technique.

The control chart’s usefulness inheres not only in its simplicity of ap- 
plication but also in its simplicity of concept. Almost any engineer will 
tell you that when test results differ too much from expected results 
(expected results, in general, being based on experience), there is either 
something wrong with the product or the test. If the engineer is honest, 
he will likely reflect a moment and add, “However, it is very difficult 
to tell when results differ too much.” The control chart does just that— 
it tells simply and clearly and in an economical way, just when one
should take action on the grounds that a set of test results differs too 
much from its predecessors. 
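
The decision the chart makes can be sketched in a few lines of Python. This is an illustration only, not the procedure of any particular laboratory: the function names and the velocities are invented, subgroups of 7 rounds are assumed, and the constants are the conventional 3-sigma chart factors for that subgroup size.

```python
# Conventional 3-sigma control chart factors for subgroups of 7 rounds.
A2, D3, D4 = 0.419, 0.076, 1.924

def chart_limits(past_series):
    """Control limits for averages and ranges, set from past subgroups."""
    averages = [sum(s) / len(s) for s in past_series]
    ranges = [max(s) - min(s) for s in past_series]
    center = sum(averages) / len(averages)
    mean_range = sum(ranges) / len(ranges)
    return {
        "xbar": (center - A2 * mean_range, center + A2 * mean_range),
        "range": (D3 * mean_range, D4 * mean_range),
    }

def judge(series, limits):
    """Return (average within limits?, range within limits?) for a new series."""
    average = sum(series) / len(series)
    spread = max(series) - min(series)
    xbar_ok = limits["xbar"][0] <= average <= limits["xbar"][1]
    range_ok = limits["range"][0] <= spread <= limits["range"][1]
    return xbar_ok, range_ok

# Invented velocities (fps): four past series, then one new series.
past = [
    [1750, 1748, 1753, 1751, 1749, 1752, 1747],
    [1751, 1749, 1750, 1754, 1748, 1752, 1746],
    [1752, 1750, 1749, 1753, 1751, 1748, 1754],
    [1749, 1747, 1751, 1750, 1746, 1752, 1748],
]
limits = chart_limits(past)
new_series = [1765, 1763, 1768, 1766, 1764, 1767, 1762]
verdict = judge(new_series, limits)
```

In this invented case the new series has an ordinary range but an average well above the upper limit, so the chart calls for action on the average while raising no question about the dispersion.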

A new use for the control chart. The control chart has been applied for 
the most part to the control of manufacturing process and the judging 
or grading of the finished product with respect to quality. I believe 
that the Ordnance Department is the first to apply intentionally and 
systematically the control chart technique to the control of measure- 
ments. Ordinarily, one thinks of the arsenals as the manufacturers of 
defense products; but the Proving Ground turns out a product which 
is just as materially important as the munitions, and its product is 
measurements. If its flow of measurements is subject to assignable 
causes of variation, good munitions may be rejected and poor munitions 
accepted; if the range of variation of its measurements (accidental 
errors) is large, more rounds have to be fired in testing the product than 
if its measurements were more precise. In order to save valuable time 
and material, statistical methods are used in the control of the Proving 
Ground’s product. Although I must omit all technical and military de- 
tails, I believe I can still give you an interesting example. 

In the testing of munitions, it is often necessary to fire the product 
in a gun in order to obtain an observable manifestation of the quality 
characteristic under consideration. Under these circumstances, the gun 
is not a gun in the usual sense, but a part of a system of measuring in- 
struments. In scientific work, it is a cardinal principle that all measur- 
ing instruments must be calibrated. It is not so well recognized that 
such calibration should involve three considerations: accidental errors, 
semi-constant systematic errors, and mean constant errors. Together 
with the statistical method, I shall illustrate these considerations. 

In the proof of ammunition, the necessity for calibration of the gun 
as a measuring instrument has long been recognized in an engineering 
sort of way. At the beginning of each experiment made with a test gun, 
a series of rounds (say 5 to 7) is fired with standard components. If the 
nominal muzzle velocity of the gun is 1800 fps (feet per second), and 
the average of the calibration series is 1750 fps, the gun is said to have 
an erosion correction of +50 fps, and 50 fps is added to all observed 
velocities in connection with that phase of the experiment performed 
on the component subject to test. That is, it is presumed that due to 
wear, the gun is firing low. Of course, the correction tomorrow may be 
+60, and the next day +40 (see Chart I). Occasionally, during the 
early life of a gun, a negative erosion correction may be encountered, 
but customarily that kind of correction used to be ignored, since it was 
presumed impossible for a gun to wear backwards. 
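
The arithmetic of the customary correction is easily sketched. The following Python fragment is an illustration only: the function names and the calibration velocities are invented, and `legacy_correction` merely encodes the past practice, described above, of ignoring a negative correction.

```python
def erosion_correction(nominal_fps, calibration_fps):
    """Nominal muzzle velocity minus the average of the calibration series."""
    average = sum(calibration_fps) / len(calibration_fps)
    return nominal_fps - average

def legacy_correction(nominal_fps, calibration_fps):
    """Past practice: a negative correction was ignored (treated as zero)."""
    return max(erosion_correction(nominal_fps, calibration_fps), 0.0)

def corrected_velocity(observed_fps, correction_fps):
    """Apply the day's correction to an observed test velocity."""
    return observed_fps + correction_fps

# Invented calibration series of 5 rounds averaging 1750 fps.
calibration = [1752, 1748, 1751, 1749, 1750]
correction = erosion_correction(1800, calibration)
test_round = corrected_velocity(1770, correction)

# A young gun firing slightly high gives a negative correction.
high_series = [1803, 1801, 1802, 1800, 1804]
ignored = legacy_correction(1800, high_series)
```

Note that the correction is recomputed from each day's short series, so every accidental fluctuation in that series is carried into all the corrected velocities of the day.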
When I came upon the scene, as a statistician I said to myself, “This
is utter nonsense. These people are treating as a part of a systematic
error what is really nothing but an accidental error of observation. The
wear of a gun is not a discrete but a continuous function.¹ One should
merely average one’s calibration firings; keep a control chart (fitting
the central line by least squares, if necessary); and as long as a single
calibration series is within control limits, one should make a correction


CHART I
CONTROL CHART FOR VELOCITY OF A GUN
Averages and Dispersions for Groups of 7
Averages with trend line: $\bar{X} = 2202.36 - 0.01261\,N$
which is equal to the difference between the nominal velocity and the
central line of the control chart. This method is obvious and manifest.”

However, as an engineer, I said, “Quiet! Say nothing about this yet, 
unless you want to damage your engineering reputation. Statistically, 
you may make a stupid statement because you are not sufficiently 
aware of the engineering considerations that underlie the phenomenon.” 

As an Ordnance Officer, I said in Al Smith’s well-known phraseology, 
“Let’s examine the record.” 

I got the records of firings reaching back into the years. I examined 
the lives of guns that had been fired to the end of their test lives. Chart 
I is a typical control chart resulting from such an examination. 
Whether I used the symmetric ±3 sigma limits² or percentage limits


1 Neglecting, of course, discontinuities such as those that might result from periodic coppering 
and decoppering of the bore. 

2 For discussion of kinds of control chart limits see “Control Chart Method of Analysing Data,” 
Z1.2-1941, p. 11. The American Standards Association, 29 West 39 Street, New York. See also Leslie E. 
(usually the 0.001 and 0.999 limits for $\bar{X}$ and the 0.005 and 0.995 limits
for R), the results were always the same. The dispersions would show 
very satisfactory control. The averages invariably showed frightful 
lack of control, with from 10 to 40 per cent of the points outside of 
limits. Fitting the central line by least squares or allowing it to go down 
by steps through the life of the gun made little difference. Certainly 
external causes were being superimposed upon the random cause system 
which should have governed the behavior of the means. The contrast 
between the quite satisfactory control of the dispersions and the lack 
of control of the means made this obvious, but I had no improvement 
to offer on an existing system of doubtful validity until I could explain 
both why it was invalid and what should be done about it. 

Naturally one suspects cyclical trends in data such as these. Statisti- 
cal tests confirmed the suspicion of lack of normality. I tried to cor- 
relate the deviations of the sample means from the grand mean with 
the time of the year, the temperature, humidity, air density and various 
other phenomena without success. It is a widely accepted axiom that 
the statistical method merely detects trouble. I tried a large number of 
the tricks of the trade, but the statistical attack would not yield a solu- 
tion of the causes of the trouble. Its identification and elimination were 
engineering problems. Here was freshly landed, sadly stranded, iri- 
descent, smelly fish. 

Engineering analysis of the record. One might well suspect that the 
standard components with which the gun was fired were not in a state 
of statistical control. However, ordnance products, especially those 
manufactured in the regular arsenals, tend toward a state of statistical 
control even when manufactured only under engineering control, per- 
haps because they are removed by several degrees of refinement from 
the initial raw material. This condition may be contrasted, for example, 
with the steel industry. Control also is fostered by the fact that the 
hazardous character of the material renders it necessary that it be 
scrupulously manufactured on a quality rather than a dollar basis. 
Furthermore, the most influential component in the calibration problem 
is the powder. I used to be closely associated with this component as 
Assistant Chief of Manufacture at Picatinny Arsenal, where it was 
made, and ran many quality control tests on it at that time. Hence, I 
believed the powder was controlled. However, conditions like those
shown in Chart I make one doubt almost anything. I wondered if the 
broken sequences arising from the fact that only the calibration firings 





Simon, An Engineers’ Manual of Statistical Methods, John Wiley and Sons, New York, 1941, Chapters 
IV, VII and Appendix C. 
are shown on the control chart, with the intervening service firings
omitted, could have anything to do with it, so I decided to run control 
charts on some series of firings of the same set of components where 
no other components interrupted the series. 

Series firings of the same set of components are almost always short, 
and are fired in a single day or two days. I gathered what I could. By 
diligent search of the records a considerable number of charts were 
made which were like Chart II except for length. Lack of control was 
almost never indicated either for standard components or even for non- 
standard components submitted for test. 

Now there were enough pieces of the puzzle assembled to fit into an 
integrated whole. Firings show control within the day. Firings fre- 
quently show lack of control from day to day. Therefore, the lack of 
control rests not with the components and not with the gun but with 
the periodic introduction of a different systematic error in the process 
of making observations.’ That is, there is (1) a mean error, or rather 
deviation from standard, associated with a short series which is due to 
wear of the gun (this causes the central line to slant), (2) a systematic 
error which is constant for a day or a set-up of apparatus, but which 
varies from day to day (X is temporarily shifted up or down, thereby 
causing points outside of the 3 sigma limits), and (3) there is the usual 
accidental or random sampling error (which may sometimes cause 
points to fall still farther outside the 3 sigma limits, or sometimes bring 
them in). 
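
These three components can be written as a simple additive model. The notation is introduced here for illustration and is not taken from the address: $i$ indexes the day (or set-up), $j$ the round within that day's series of $n$ rounds, and $N_i$ the cumulative round number at which the series is fired.

```latex
x_{ij} = \mu(N_i) + s_i + e_{ij}, \qquad
\bar{x}_i = \mu(N_i) + s_i + \frac{1}{n}\sum_{j=1}^{n} e_{ij}
```

Here $\mu(N)$ is the slowly falling mean due to wear (component 1), $s_i$ is the systematic error of the day (component 2), and $e_{ij}$ is the accidental error (component 3). Within one day $s_i$ is constant, so the ranges reflect only $e_{ij}$; from day to day the averages carry both $s_i$ and the averaged accidental error, which is why the dispersions can show control while the averages do not.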

This analysis is in agreement with engineering considerations. The 
so-called accidental error is due to a combination of sampling fluctua- 
tions in the product (the standard components) and accidental errors of 
observation. The functioning of the systematic error is not so obvious, 
but it can easily be explained. Velocity is an indirect measurement 
which is inferred from the measured time-interval required for the 
projectile to pass from one screen erected along its trajectory to 
another. Having set up the screens and other necessary apparatus, 
these conditions remain relatively constant during an experiment. But, 
upon the repetition of the experiment (another standard powder firing) 
the screens are set a little closer or a little farther apart, the other neces- 
sary apparatus is slightly changed, and a different systematic error is 
introduced for that experiment. About 17 independent things affect 
this systematic error, sometimes called the error of the day. 

Now one may talk freely about the trouble either as a statistician, 


3 For an earlier study see Student, “Errors of Routine Analysis,” Biometrika XIX, pp. 151-164, 
1927. 
engineer, or professional worker in a special field; but the remedy for
the trouble is not yet at hand. 

The engineering and statistical analyses have now changed the 
original problem of “erosion” corrections and complicated it by logically 
injecting the issue of control. If observed average velocities are accepted 
without restriction, the manner of application of the “erosion” correction
may be relatively unimportant, since a large error in the set-up
of the measuring apparatus may result in an entirely invalid average 
velocity. Therefore, one must first seek criteria whereby he can judge 
whether the observed value should be accepted at all or whether action 
should be taken to discover assignable causes for variation in the meas- 
uring system, or perhaps even repeat the experiment. 

An entirely satisfactory solution for this rather complicated problem 
is not likely, but surely statistical methods have something to offer 
which is better than unaided judgment. In the first place, we already 
know $\sigma_a$ (the standard deviation of the accidental errors). If one
knew the true average velocity for each of the respective
set-ups, it would be a simple matter to calculate $\sigma_s$ (the standard deviation
of the semi-constant or systematic error). Then, by the rule of the
sum of two variables, the total variation for observed averages of
samples of n would be

$$\sigma_T^2 = \sigma_s^2 + \frac{\sigma_a^2}{n}.$$

However, the systematic deviations cannot be directly observed.
Only n measurements of each respective deviate are known. Hence,
were it not for the wear of the gun, i.e., the progressively shifting mean
as shown by the trend line of Chart I, $\sigma_T$ could be calculated directly
from the observed means. Fortunately, there is a very simple method
of eliminating the effect of a shifting mean, which has been used in
exterior ballistics for a number of years. One merely calculates $\sigma_T$ as
Thus, one can compute $\sigma_a$, $\sigma_s$, and $\sigma_T$. One can now establish control
chart limits for averages, from sample size n, at $\bar{X} \pm 3\sigma_T$. This control
chart leaves some things to be desired, for from the viewpoint of systematic
errors the sample size is one, and furthermore, from the viewpoint
of systematic errors it falls short of being a control chart in the
Shewhart sense, as it merely detects whether future measurements appear
to be better or poorer than past measurements. However, it appears
that this procedure certainly will not tend to indicate trouble
when there is no trouble. This is healthful, for a system which cries,
“Wolf! Wolf!”, when there is no wolf will at once be regarded as ridiculous
and discarded. It is essential, especially at the outset, that
there be trouble, when it is so indicated, and that it be found.


CHART II
CONTROL CHART FOR VELOCITY OF A GUN
Averages and Dispersions for Groups of 7
Averages with trend line: $\bar{X} = 2202.38 - 0.01261\,N$
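
A minimal sketch of how the three standard deviations might be computed follows. Two assumptions are made and should be read as such: the trend-free estimate of the total standard deviation is taken to be the mean square successive difference method (successive averages are differenced so that a slow drift largely cancels), and the accidental standard deviation is taken from the mean range with the conventional factor for subgroups of 7. All function names and data are hypothetical.

```python
from math import sqrt

D2_N7 = 2.704  # conventional factor relating mean range to sigma for n = 7

def sigma_accidental(ranges, d2=D2_N7):
    """Within-series (accidental) standard deviation from the mean range."""
    return (sum(ranges) / len(ranges)) / d2

def sigma_total_successive(averages):
    """Standard deviation of series averages from successive differences.

    Assumed estimator: sum of squared successive differences divided by
    2(k - 1), where k is the number of averages.
    """
    diffs = [b - a for a, b in zip(averages, averages[1:])]
    return sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))

def sigma_systematic(sigma_t, sigma_a, n):
    """Solve sigma_T^2 = sigma_s^2 + sigma_a^2 / n for sigma_s."""
    excess = sigma_t ** 2 - sigma_a ** 2 / n
    return sqrt(excess) if excess > 0 else 0.0

def control_limits(center, sigma_t):
    """Limits for averages at the central line plus or minus 3 sigma_T."""
    return center - 3 * sigma_t, center + 3 * sigma_t

# Invented data: five calibration series of 7 rounds each.
averages = [2200.0, 2204.0, 2198.0, 2203.0, 2197.0]
ranges = [8.0, 8.5, 7.5, 8.0, 8.56]

s_a = sigma_accidental(ranges)
s_t = sigma_total_successive(averages)
s_s = sigma_systematic(s_t, s_a, n=7)
lower, upper = control_limits(2200.0, s_t)
```

The subtraction in `sigma_systematic` is simply the variance relation for averages of n solved for the systematic part; it is clamped at zero because with few series the estimate of the excess can come out negative.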

When reputable engineers do work under carefully controlled conditions
it is not rife with errors; neither is it entirely free of them. If the
control chart procedure just outlined be applied to past data, certainly 
one cannot find the assignable causes of variation when such causes 
are indicated, since the measuring set-ups were dismantled long ago. 
However, one can observe whether the frequency with which trouble is 
indicated is reasonable. Let us apply the procedure to the data of 
Chart I (real data so far as the statistical illustration is concerned; it
has merely been altered so as to rob it of all military or engineering
significance without prejudice to its illustrative value). Chart II shows
this application. No point of those shown is beyond the $3\sigma_T$ control
limits. (There was one out in the data not shown.) In like manner, the
procedure was tried against very much more proving ground data be- 
fore it was recommended for application. 

I shall wish to say more about the use of this special procedure, but 
in the interest of coherence it must be observed that the original prob- 
lem was the “erosion” correction. That problem has not been solved. 
The engineer said that the correction should be the nominal velocity 
minus the observed. The statistician said that it should be the nominal 
velocity minus the current value of $\bar{X}$. At this point the issue became
confused, for without some knowledge of statistical control, one knows
little about $\bar{X}$. The statistical problems are now fairly well cleared up.
Are we now able to answer the question of the best value to take for the 
so-called “erosion” correction? 

Yes, we can at least give an approximate answer, but first let us make
several observations. If $\sigma_a/\sqrt{n}$ were very large as compared with $\sigma_s$
(see Chart IIIa), certainly one should take the central line as the value
of that day’s velocity as suggested by the statistician. This is precisely
what he had in mind when he suggested it. If $\sigma_s$ were very large as compared
with $\sigma_a/\sqrt{n}$ (see Chart IIIb), certainly one should take the
point itself as the best estimate of the day’s velocity as was the past
practice. However, neither of these conditions obtains in practice, and
in general the ratio of $\sigma_s$ to $\sigma_a/\sqrt{n}$ lies between } and 4 (see Chart
IIIc).

If a point happens to fall within the $3\sigma_T$ limits but outside of the $3\sigma_s$
limits, we are almost certain that the observation has suffered from
sampling fluctuations, for the $3\sigma_s$ limits mark the range of practically
all true values of $\bar{X}$. Hence, we should certainly assume that the true
value for the day is at least within the nearest boundary of the $3\sigma_s$
range. It may be well within the range; it may be directly on the
central line. However, knowledge ceases with the boundary, and any 
advance within the boundary is an advance into the unknown. 

If a point happens to fall within the $3\sigma_s$ limits, I can see no reason for
not accepting the observations as the true value of the velocity for that 
day. 

There may be better rules than those shown in Chart IIIc. Certainly 
these are open to considerable criticism.⁴
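
The rules just stated can be sketched as a small Python function. It is a hedged illustration: the function name, the status labels, and the numbers are hypothetical; `sigma_s` stands for the standard deviation of the systematic error and `sigma_t` for the total standard deviation of the averages.

```python
def velocity_of_day(observed_mean, center, sigma_s, sigma_t):
    """Return (status, estimate of the day's velocity) for one series."""
    inner_lo, inner_hi = center - 3 * sigma_s, center + 3 * sigma_s
    outer_lo, outer_hi = center - 3 * sigma_t, center + 3 * sigma_t
    if observed_mean < outer_lo or observed_mean > outer_hi:
        # Lack of control: no estimate; seek the assignable cause.
        return "out of control", None
    if observed_mean < inner_lo:
        # Between the two sets of limits: go only as far as the boundary.
        return "boundary", inner_lo
    if observed_mean > inner_hi:
        return "boundary", inner_hi
    # Inside the inner limits: accept the observation itself.
    return "accepted", observed_mean

# Invented figures: central line 1750 fps, sigma_s = 2, sigma_T = 3.
accepted = velocity_of_day(1752, 1750, 2, 3)
boundary = velocity_of_day(1758, 1750, 2, 3)
rejected = velocity_of_day(1762, 1750, 2, 3)

# The "erosion" correction is then nominal velocity minus the estimate.
correction = 1800 - boundary[1]
```

The design choice worth noting is the middle branch: the estimate is moved inward only to the nearest inner boundary, never all the way to the central line, which mirrors the remark that knowledge ceases with the boundary.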


4 Since the date of this address, a solution has been obtained at the Ballistic Research Laboratory 
which minimizes the root mean square error of the estimated “true value for the day.” 
However, the important point of the whole standard powder procedure
turns out to be the control chart technique; i.e., the rejecting of
extreme and improbable values, and the identification and elimination
of the causes that produced them. The $\pm 3\sigma_s$ band is in general relatively
narrow, and the error committed is small even if the boundary
of the band is assumed to be the true “velocity of the day,” when actually
the true value is the central line. This condition exists, because
the really serious errors do not occur under the control chart technique.
Hence, the injection of the issue of control was not really a digression
from the subject of “erosion” correction. It is the major part of the solution.
It is further obvious that increased accuracy can be obtained only
through increased sample size, and that the velocity of the day may be
known with whatever accuracy one chooses, at the expense of increased
samples.

This may appear to be rather a simple problem with a simple answer, 
but things do not seem that way when one is working on them. What 
I have outlined represents part-time work for a year;⁵ and whereas I
feel that I should have made swifter progress, considerable experience 
in research and development work indicates that all new things—even 
new or only slightly different applications of old techniques—come 
slowly. If a man can make a substantial contribution to only one or two 


5 I am indebted to Dr. Walter A. Shewhart of the Bell Telephone Laboratories and Dr. W. Edwards
Deming of the Bureau of the Census for helpful advice and consultation on this problem. 
new things in a year, it is a considerable accomplishment. I make these
remarks because one who is engaged in quality control work is apt to 
feel that his progress is slow, and subject himself to a considerable 
amount of harsh self-criticism, when actually he is making very credita- 
ble progress. 

Of course one wonders how the procedure worked out in practice. Up 
to the present, it is working very well indeed, and the credit for its suc- 
cess rests for the greater part with Mr. David Kinsler of the Aberdeen 
Proving Ground, who had charge of its practical application. By virtue 
of good initial preparation, energy, resourcefulness, and persevering 
application, he ironed out some wrinkles in the plan, gained its acceptance
on the part of operating personnel, and made several substantial
improvements in its statistical and engineering features. 

Charts like Chart II have now been kept on many guns, although 
the grand average is generally plotted horizontally by steps instead of 
on a slant. Chart IV shows the first working chart that was kept, and 
the first point which was out of control. It will be noted that a con- 
siderable number of points had to be accumulated, before control 
limits could be predicated. (At the present time approximate limits 
can generally be predicted at the outset.) During this period, a point 
occurred which was out of control and which, of course, was not recog- 
nized at that time. Work actually started at about round number 500, 
and lack of control was indicated at round number 675. At that time
we were running what I call “fini la guerre”; i.e., when a point goes 
out of control, everything stops until the trouble is found. The firing 
took place at noon, and it was 3:30 p.m. when we found the trouble. 
That first trouble just had to be found. It was an error in screen dis- 
tance. That is, the two velocity screens were supposed to be 200 feet 
apart. An error of an even foot had been made in their measurement,
thereby resulting in a velocity determination which was ½ per cent
too low. Since this measurement is ordinarily checked and double 
checked, trouble was not sought in that locality except as a last re- 
sort. However, it is hard to tell when and where people will make er- 
rors. The catching of this error surely put a feather in the cap of quality 
control. Care was taken to indulge in no recriminations or to place no 
undue blame on anyone for honest mistakes. Instead, everyone was 
encouraged to get behind the system and work for better measure- 
ments. At the last report I had on the system, 19 instances of lack of 
control had occurred and the assignable cause had been located in all 
instances except one. Of course, all instances of lack of control may not 
have been detected, but the errors could not have been large, for the 
control limits prohibit large errors. Hence, there is very strong assur- 
ance of high quality measurements from the Proving Ground. 
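
The size of the screen-distance error is easy to check. In the sketch below the time of flight is invented, and it is an assumption that the screens actually stood 201 feet apart while 200 feet was used in the computation (the direction that makes the computed velocity too low).

```python
def velocity_fps(screen_distance_ft, time_s):
    """Velocity inferred from the time of flight between two screens."""
    return screen_distance_ft / time_s

assumed_ft = 200.0   # distance used in the computation
actual_ft = 201.0    # assumed true distance, one foot greater
time_s = 0.1005      # invented time of flight between the screens

reported = velocity_fps(assumed_ft, time_s)
actual = velocity_fps(actual_ft, time_s)
relative_error = (reported - actual) / actual   # close to -0.005
```

Because velocity is proportional to the distance, the relative error does not depend on the time of flight: one foot in about two hundred is about one half of one per cent at any velocity.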

I think also, there is an interesting corollary to the control program. 
During this emergency, proving ground activities have expanded 
enormously and many relatively untrained persons are charged with 
important measurements. A decline in standards is to be expected. 
Yet, under existing management and with the control chart technique, 
the charts show that the precision is no poorer than it was a year ago; 
in fact, it appears to be slightly better. This is a contribution of statis- 
tical methods to the war. 























TESTS OF SIGNIFICANCE CONSIDERED 
AS EVIDENCE* 


By Joseph Berkson, M.D.
Division of Biometry and Medical Statistics, Mayo Clinic 


“After all, the higher statistics are only common sense 
reduced to numerical appreciation.”—Karl Pearson.


THERE WAS a time when we did not talk about tests of significance;
we simply did them. We tested whether certain quantities were
significant in the light of their standard errors, without inquiring as to 
just what was involved in the procedure, or attempting to generalize it. 
In recent years tests of significance have been more broadly conceived 
as tests of hypotheses, and they have been generalized as t tests, F tests
and certain amplifications of these, such as analysis of variance or of 
covariance. It is hardly an exaggeration to say that statistics, as it is 
taught at present in the dominant school, consists almost entirely of 
tests of significance, though not always presented as such, some com- 
paratively simple and forthright, others elaborate and abstruse. Behind 
this is a doctrine of analysis that consists of setting up what is called a 
“null hypothesis” and testing it. Indeed, in this conception not only 
does this procedure characterize the method of statistics, but it is con- 
sidered to be the very essence of all experimental science. In his well 
known book, The Design of Experiments, R. A. Fisher wrote, “Every 
experiment may be said to exist only in order to give the facts a chance 
of disproving the null hypothesis.”¹

What is this null hypothesis procedure? I quote from a recent text.²

We have just set up the hypothesis that our sample of 900, which has a
mean of 15,071 miles, is a random sample drawn from the population having
a known mean of 15,200 miles. . . . Such a hypothesis is called a null hypoth- 
esis since our computations undertake to nullify it. The procedure may be 
summarized into three steps: (1) Set up the hypothesis that the true differ- 
ence is zero. (2) Upon the basis of this hypothesis determine the probability 
that such a difference as the one observed might occur because of sampling 
variations. (3) Draw a conclusion concerning the hypothesis. If such ob- 
served difference could hardly have occurred by chance, we have cast 
much doubt upon the hypothesis. We therefore abandon the hypothesis 
and conclude that the observed difference is significant. 
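
The three steps of the quoted procedure can be sketched in Python. The quotation gives the sample size and the two means but not the population standard deviation, so the value of 1,500 miles used here is purely hypothetical, as are the function name and the 5 per cent level; a normal approximation is assumed.

```python
from math import erfc, sqrt

def null_hypothesis_test(sample_mean, population_mean, population_sd, n):
    """Steps 1 and 2: assume zero true difference, then find the
    probability of a difference at least as large as the one observed."""
    standard_error = population_sd / sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    p_two_sided = erfc(abs(z) / sqrt(2))   # two-sided normal tail area
    return z, p_two_sided

# 900, 15,071 and 15,200 are from the quotation; 1500 is hypothetical.
z, p = null_hypothesis_test(15071, 15200, 1500, 900)

# Step 3: draw a conclusion at an assumed 5 per cent level.
significant = p < 0.05
```

With the hypothetical standard deviation the difference is a little over two and a half standard errors, and the procedure as quoted would abandon the hypothesis; with a larger standard deviation it would not.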


* A paper presented at the 103rd Annual Meeting of the American Statistical Association, New 
York, December 29, 1941. 

1 R. A. Fisher, The Design of Experiments. Ed. 2, London, Oliver and Boyd, Ltd., 1937, p. 19.

2 F. E. Croxton and D. J. Cowden, Applied General Statistics. New York, Prentice-Hall, Inc., 1940,
p. 310.
This I believe is a fair if abbreviated statement of the essential
procedure as it is generally understood. If the experience at hand would 
occur only very infrequently in a given hypothesis, the hypothesis is 
considered disproved. 

The argument has an apparent plausibility and for many years I ad- 
hered to it. However, set against experience with actual problems, re- 
flection has led me to the conclusion that it is erroneous, and that a re- 
evaluation will lead to clearer comprehension in the application of tests 
of significance and also serve as a corrective of some of its misuses. 

In the first place, the argument seems to be basically illogical. Con- 
sider it in symbolic form. It says “If A is true, B will happen some- 
times; therefore if B has been found to happen, A can be considered dis- 
proved.” There is no logical warrant for considering an event known to 
occur in a given hypothesis, even if infrequently, as disproving the hy- 
pothesis. 

More to the present point, the argument does not seem to accord 
with what would be the mode of reasoning in ordinary rational dis- 
course, nor with the rationale of usual procedures as they are observed 
in the scientific laboratory. Suppose I said, “Albinos are very rare in
human populations, only one in fifty thousand. Therefore, if you have 
taken a random sample of 100 from a population and found in it an 
albino, the population is not human.” This is a similar argument but if 
it were given, I believe the rational retort would be, “If the population 
is not human, what is it?” A question would be asked that demands an
affirmative answer. In the null hypothesis schema we are trying only to 
nullify something: “The null hypothesis is never proved or established 
but is possibly disproved in the course of experimentation.” But ordi- 
narily evidence does not take this form. With the corpus delicti in front 
of you, you do not say, “Here is evidence against the hypothesis that 
no one is dead.” You say, “Evidently someone has been murdered.” 
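
The arithmetic behind the albino illustration can be checked directly. The sketch below assumes independent draws at the quoted rate of one in fifty thousand; the variable names are illustrative only. The chance of at least one albino in a sample of 100 from a human population comes to about 0.002, which is rare, yet no one would infer from it that the population is not human.

```python
# Chance of at least one albino in a random sample of 100,
# assuming independent draws at the quoted rate of 1 in 50,000.
rate = 1 / 50_000
sample_size = 100

p_at_least_one = 1 - (1 - rate) ** sample_size
print(f"P(at least one albino in {sample_size}) = {p_at_least_one:.5f}")
```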

Nor do you find experimentalists typically engaged in disproving 
things. They are looking for appropriate evidence for affirmative con- 
clusions. Even if the mediate purpose is the disestablishment of some 
current idea, the immediate objective of a working scientist is likely to 
be to gain affirmative evidence in favor of something that will refute 
the allegation which is under attack. 

Does this mean that the application of tests of significance is in basic 
disaccord with rational scientific procedure? I am not sure. I think that 
there is a possibility of using them soundly, but the rule of inference on 
which they are supposed to rest has been misconceived, and this has led 
to certain fallacious uses. 

Consider the objective of testing whether a distribution is normal.
One could validly say, “If the distribution is normal and, the skewness
of the sample, $g_1$, having been calculated, if a die of 100 faces, five of
which are black, is thrown at random, a black face will occur only five
times in 100.” No one would suggest that the finding of a black face on
a die following such a calculation is any reason for rejecting the null
hypothesis that the distribution is normal. But when one says, “If the
distribution is normal, a value of $g_1/S_{g_1} \geq 1.96$ will occur only
five times in 100,” the finding of such a value for $g_1/S_{g_1}$ is taken
as reason for rejecting the null hypothesis. What is the essential difference between the
two situations? Following the procedures which were outlined for deal- 
ing with a null hypothesis, one should reject the hypothesis that the 
distribution is normal on the finding of a black face, for it is surely an 
event rare in the circumstance of the distribution being normal. The 
difference appears to be that we recognize that if the distribution 
actually were abnormal (skew), the occurrence of a black face still 
would not be expected, but a large value of $g_1/S_{g_1}$ would be expected. The
latter constitutes evidence in favor of skewness. We may discern, as 
operating in the realm of tests of significance, a principle that I suggest 
is generally operative in scientific inquiry; it is this. The finding of an 
event which is frequent under a hypothesis $H_1$ can be taken as evidence
in favor of $H_1$. If $H_0$ is a contradictory alternative to $H_1$ for which the
event would not be frequent, then per corollary the finding of the event
is, in so far, evidence in disfavor of $H_0$.
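
The contrast between the die and the skewness ratio can be made concrete by simulation. The sketch below is an illustration under stated assumptions only: samples of 100, an exponential distribution standing in for a skew alternative, the moment form of the skewness coefficient, and its standard error under normality. None of these choices comes from the text.

```python
import math
import random

def skewness_ratio(sample):
    """Moment skewness m3 / m2**1.5 divided by its standard error under normality."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    g1 = m3 / m2 ** 1.5
    se = math.sqrt(6.0 * (n - 2) / ((n + 1) * (n + 3)))
    return g1 / se

def frequency_of_large_ratio(draw, n=100, trials=2000, seed=1):
    """Share of simulated samples whose |skewness ratio| reaches 1.96."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        if abs(skewness_ratio(sample)) >= 1.96:
            hits += 1
    return hits / trials

# Assumed stand-ins: standard normal for the null, exponential for "skew".
freq_normal = frequency_of_large_ratio(lambda rng: rng.gauss(0.0, 1.0))
freq_skew = frequency_of_large_ratio(lambda rng: rng.expovariate(1.0))

# The die ignores the data, so a black face is equally rare under either hypothesis.
freq_black_face = 5 / 100

print(f"large ratio: normal {freq_normal:.3f}, skew {freq_skew:.3f}")
print(f"black face:  normal {freq_black_face:.3f}, skew {freq_black_face:.3f}")
```

Both events are rare when the distribution is normal, but only the large ratio becomes frequent under the skew alternative, and that difference is what gives it evidential force.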

At this point I can imagine the question rising, “What difference does 
it make whether you say that you reject $H_0$ because for it the event is
not frequent, or because you are accepting the alternative $H_1$ for which
it is frequent?” To this the first answer must be that it would seem to
be a sound idea to get one’s head clear as to what are the principles on
which one is really acting. If an event has occurred, the definitive
question is not, “Is this an event which would be rare if $H_0$ is true?” but “Is
there an alternative hypothesis under which the event would be rela- 
tively frequent?” If there is no plausible alternative at all, the rarity is 
quite irrelevant to a decision, and if there is such an alternative, the 
decisive question is, “Would the event be relatively frequent?” Sec- 
ondly, the pursuit of a false principle for testing the null hypothesis 
will lead to false conclusions that will be avoided if one is consciously 
guided by the principle suggested here as being the correct one. I shall 
cite an example. 

As an illustration of a test of linearity under the caption, “Test of 
straightness of regression line,” R. A. Fisher utilizes data relating the 
temperature to the number of eye facets of Drosophila melanogaster, 
the facet number being measured in factorial units. An analysis of 
variance procedure is utilized for the test and, the calculations having 
been made, Fisher says, 


The deviations from linear regression are evidently larger than would be 
expected, if the regression were really linear, from the variations within the 
arrays. For the value of z we have 1.2434 while the 1 per cent point is about 
.488. There can therefore be no question of the statistical significance of 
the deviations from the straight line . . . the departure from linearity was 
markedly significant.* 


I have plotted the data of mean facet number in relation to tempera- 
ture together with the least square line and they are shown in Chart I. 


CHART I

MEAN NUMBER OF EYE FACETS OF DROSOPHILA MELANOGASTER RAISED
AT DIFFERENT TEMPERATURES AND BEST FITTING STRAIGHT
LINE BY METHOD OF LEAST SQUARES

Source: Data from R. A. Fisher, Statistical Methods for Research Workers.
London, Oliver and Boyd, 1938, p. 260.


It was found by the significance test as applied that this regression was
not straight, but on inspection it appears as straight a line as one can
expect to find in biological material. What has betrayed the author is a
faithful adherence to an unsound principle: to wit, reject the null
hypothesis tested, in this case that the regression is linear, if the P of the
test is small.

*R. A. Fisher, Statistical Methods for Research Workers. Ed. 7, London, Oliver and Boyd, Ltd., 1938, pp. 259-265.

Let us consider the problem according to the principle advanced here. 
The event which has been found to have happened, in this case the 
small P, is to be considered as evidence in favor of any hypothesis under 
which it would be a frequent occurrence. Under what hypothesis would 
the P, considering its mode of calculation, be a frequent occurrence? If 
the regression were curvilinear, a small P is to be expected relatively 
frequently. In so far as this is so, a small P is evidence in favor of curvi- 
linearity and because of this and primarily because of this, a small P can 
be considered evidence in disfavor of its alternative, linearity. But also 
a small P is to be expected relatively frequently if the regression is lin- 
ear and the variability heteroscedastic; hence a small P is also evidence 
in favor of linearity plus heteroscedasticity. Or again a small P is to be 
expected frequently if the regression is linear and a value of the abscis- 
sal variate, in this case the temperature, is not constant but subject to 
fluctuation. And there may be other conditions which, with linearity, 
would produce a small P relatively frequently. The small P is favorable 
evidence for any or several of these. Which of these shall be taken to 
have been demonstrated by the evidence of the small P will have to be 
determined by other evidence, possibly other statistical tests. In this 
case my own judgment would be, not that the regression is nonlinear, 
but that the temperature has varied during each or some of the experi- 
ments. At least that would explain the small P. 
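
How an unrecorded fluctuation of temperature can produce a small P while the regression remains strictly linear may be seen in a simulation. The following is a minimal sketch under assumed conditions, not a reconstruction of Fisher's data: the temperatures, array size, slope, and error magnitudes are all hypothetical, and the 5 per cent point is taken empirically from simulated experiments with steady temperature rather than from tables.

```python
import random

# Hypothetical design: six nominal temperatures, 20 observations at each,
# and a response exactly linear in the temperature actually experienced.
NOMINAL_TEMPS = [15.0, 17.0, 19.0, 21.0, 23.0, 25.0]
PER_ARRAY = 20
INTERCEPT, SLOPE, NOISE_SD = 100.0, -5.0, 3.0

def lack_of_fit_f(temp_sd, rng):
    """F ratio of deviations from the fitted line to within-array variation.

    temp_sd is the standard deviation of the unrecorded departure of each
    array's actual temperature from its nominal value (0 means steady).
    """
    k, m = len(NOMINAL_TEMPS), PER_ARRAY
    means, within_ss = [], 0.0
    for nominal in NOMINAL_TEMPS:
        actual = nominal + rng.gauss(0.0, temp_sd)
        ys = [INTERCEPT + SLOPE * actual + rng.gauss(0.0, NOISE_SD) for _ in range(m)]
        ybar = sum(ys) / m
        means.append(ybar)
        within_ss += sum((y - ybar) ** 2 for y in ys)
    # With equal array sizes, the least-squares line through all observations
    # is the least-squares line through the array means.
    xbar = sum(NOMINAL_TEMPS) / k
    grand = sum(means) / k
    sxx = sum((x - xbar) ** 2 for x in NOMINAL_TEMPS)
    sxy = sum((x - xbar) * (y - grand) for x, y in zip(NOMINAL_TEMPS, means))
    slope = sxy / sxx
    intercept = grand - slope * xbar
    lof_ss = m * sum((y - (intercept + slope * x)) ** 2
                     for x, y in zip(NOMINAL_TEMPS, means))
    return (lof_ss / (k - 2)) / (within_ss / (k * (m - 1)))

def rejection_rate(temp_sd, critical, trials=2000, seed=2):
    """Share of simulated experiments whose F exceeds the critical value."""
    rng = random.Random(seed)
    return sum(lack_of_fit_f(temp_sd, rng) > critical for _ in range(trials)) / trials

# Empirical 5 per cent point of F: linear regression, steady temperature.
null_rng = random.Random(1)
null_f = sorted(lack_of_fit_f(0.0, null_rng) for _ in range(4000))
critical_5pct = null_f[int(0.95 * len(null_f))]

rate_steady = rejection_rate(0.0, critical_5pct)
rate_fluctuating = rejection_rate(0.5, critical_5pct)
print(f"small P, linear and steady temperature:      {rate_steady:.3f}")
print(f"small P, linear but fluctuating temperature: {rate_fluctuating:.3f}")
```

Under these assumptions a small P is frequent when the temperature fluctuates, although the regression is linear throughout; the small P therefore cannot by itself distinguish curvilinearity from this alternative.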

According to what is advocated here, we cannot lay down any pat 
axiomatic rules such as “A very small P disproves the hypothesis 
tested,” or “Equally, a very high P disproves the hypothesis,” for it is 
not primarily the infrequency of the P which gives the finding its mean- 
ing. Each test will have to be examined and the circumstances in which 
it is applied will have to be examined, to find out, as best we can, 
whether any particular regions of P will occur relatively frequently in 
the case of an alternative to the tested hypothesis. There are situations 
in which a very large P will be frequent in an alternative, and in these
circumstances, but only in these circumstances, a very high P can be 
said to disfavor the null hypothesis. I cite an example. 

If with $(n+1)$ observations from a frequency distribution of a variate
$x$ the quantity $n s^2/\bar{x}$ is calculated, where $\bar{x} = \sum x/(n+1)$ and
$s^2 = \sum (x-\bar{x})^2/n$, it is known that the quantity is distributed in random
samples as $\chi^2$ for $n$ degrees of freedom, if the distribution is Poisson.
Small values of P, say $P \leq 0.05$, will occur with the small frequency of
five times in 100. If, however, the distribution is what has been called
supernormal, a distribution that is known to characterize certain physical
situations, the variance $\sigma^2$ is greater than the mean $\mu$, and in random
samples large values of the quantity $\chi^2$, and correspondingly low values
of $P \leq 0.05$, will be more frequent than five in 100. The finding of a
$P \leq 0.05$ therefore can be taken as preponderant favorable evidence for
the super-Poisson, and hence as unfavorable to the null hypothesis
tested that the distribution is Poisson. Similarly, if the distribution is
Poisson, large values of P, say $P \geq 0.95$, will occur with the small
frequency, five times in 100. If, however, the distribution is Bernoullian-binomial
or sub-Poisson, the variance $\sigma^2$ will be less than the mean $\mu$,
and small values of the quantity $\chi^2$ and correspondingly large values of
$P \geq 0.95$ will be more frequent than five times in 100. The finding of a
$P \geq 0.95$ therefore can be taken as preponderant favorable evidence for
the Bernoullian or sub-Poisson, and hence as unfavorable to the null
hypothesis tested that the distribution is Poisson. Here then is a case in
which either a very low value of P or a very high value can be
considered as warrant for rejecting the null hypothesis. There are other
such cases, but the rule is not general.
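
The dispersion quantity just described can be computed in a few lines. The sketch below uses three invented sets of ten counts (one over-dispersed, one under-dispersed, one roughly Poisson); the data and function names are hypothetical, and the chi-square tail area is obtained from the standard series for the incomplete gamma function rather than from tables.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom >= x), using the series for
    the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def dispersion_test(counts):
    """Return (n * s^2 / xbar, upper-tail P) with n = len(counts) - 1."""
    n = len(counts) - 1
    xbar = sum(counts) / len(counts)
    statistic = sum((x - xbar) ** 2 for x in counts) / xbar  # equals n * s^2 / xbar
    return statistic, chi2_upper_tail(statistic, n)

# Hypothetical counts, chosen only to show the three regions of P.
over_dispersed = [0, 7, 1, 9, 2, 12, 0, 8, 3, 10]
under_dispersed = [5, 5, 6, 5, 4, 5, 6, 5, 5, 5]
poisson_like = [3, 5, 4, 7, 6, 2, 5, 9, 4, 5]

stat_over, p_over = dispersion_test(over_dispersed)
stat_under, p_under = dispersion_test(under_dispersed)
stat_mid, p_mid = dispersion_test(poisson_like)
for label, s, p in [("over", stat_over, p_over),
                    ("under", stat_under, p_under),
                    ("Poisson-like", stat_mid, p_mid)]:
    print(f"{label:13s} statistic = {s:6.2f}  P = {p:.4f}")
```

A very low P points toward variance in excess of the mean and a very high P toward variance below it, which is why both tails carry meaning for this particular test.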

So much for the meaning of P’s which are relatively frequent in the 
case of an alternative, and in so far, are evidence in disfavor of the null 
hypothesis tested. In the cases in which a very low P or very high P is 
evidence in favor of an alternative, what can we say of the finding of a 
middle value of P, say a P in the region 0.3 to 0.7? Statistical authors 
are not very clear about this. For the most part they merely confine 
themselves to statements that a low P disproves and one which is not 
low does not disprove. In some cases they say explicitly that a low P
disproves but one which is not low does not prove the null hypothesis. 
What such a P should mean according to the principle advanced here 
is unequivocally clear. Since by definition such P’s will occur fre- 
quently in the case in which the null hypothesis is true, the finding of 
one is to be taken as prima facie evidence in favor of the null hypothesis. 
That is in fact the way the statistician uses them, in contradistinction 
to the way he says they should be used when he describes the testing of 
the null hypothesis. 

This was somewhat amusingly illustrated at one of our meetings. 
One of our most eminent members gave a paper presenting the appli- 
cation of the lambda test and used for illustration data designed to test 
a certain Mendelian hypothesis. The data having been examined and 
the test applied, a P of about 0.6 was found. “We can say therefore,” he 
remarked, “that the results substantiate the hypothesis.” He applied 
the test illustratively to several other sets of data successively and get- 
ting a P of considerable size, each time he said, “The results therefore 
substantiate the hypothesis.” When he was finished, an equally eminent 
mathematical colleague rose to object and said, “You cannot say that 
the results of the test support the hypothesis; all you are able to say is 
that they have not in these data disproved it.” The most interesting 
part of the colloquy is that the first mathematician accepted the correc- 
tion! 

This I find is rather typical. In the abstract the mathematical stat- 
istician insists that a middle value of P only fails to refute the hypothe- 
sis, but if he is dealing with real data and gets interested in the physical 
problem in hand, he forgets his statistical principles and relapses to the 
rules of inference applied generally in such problems. 

That statisticians with real problems in hand do interpret a middle P 
as positive support for the null hypothesis can be readily illustrated by 
innumerable examples to be found in the literature. I shall cite one that 
is in a field in which I once did some work. “Student,” in his classic
paper on the error of count with a hemocytometer,⁴ used a series of data
to examine whether the actual distribution in the hemocytometer fol- 
lowed the Poisson distribution, as it should on certain physical as- 
sumptions. He applied the Pearson chi-square test to a number of series 
and finding the P’s taken together fairly large, he concluded that the 
distribution was sensibly Poisson, and that therefore the variability 
could be taken as the square root of the average count. If this positive 
conclusion in favor of the null hypothesis tested was not obtained from 
the relatively high P’s, then his statistical work was entirely irrelevant. 
Other examples of the use by statisticians of relatively high P’s for 
demonstration of the null hypothesis are easily found if one keeps a 
weather eye open for them. 

When I say that a middle value of P is to be considered valid evi- 
dence in favor of the null hypothesis, I have by no means resolved all 
the pertinent questions that may be asked regarding it. I do not say 
anything has been “proved” or “disproved.” I leave to others the use of 
these words, which I think are quite inadmissible as applying to any- 
thing that can be accomplished by statistics. All I say is that what we 
have is in the nature of positive supporting evidence. Whether the evi- 
dence is of sufficient weight to be convincing is another matter. 

The development of what should be taken to affect the weight of the
evidence is beyond anything I wish to undertake but a few pertinent
remarks I do wish to make. Whereas it can be said that the evidence
provided by a small P correctly evaluated is broadly independent of
the number in the sample from which it has been calculated, this is not
true for such evidence as is provided by a P in the middle region, say
0.3 to 0.7. Consider Table I depicting the hypothetical results of a
physician’s judgments based on a serological test, designed to ascertain
the sex of a fetus in utero. Examine experience 1, divest yourself of
formal rules, and consider what would be your reaction. I think I can
fairly guess that it would be something like this: “We cannot say
anything from this experience: it certainly does not present any convincing
evidence that the physician can discriminate between the sexes. But I
should not want to say either that he cannot discriminate. The
experience is too small for any conclusion.” With experience 2 I think you
would say, or at any rate I should: “There is no question in my mind;
quite evidently the physician does not possess any ability to
discriminate by this serological test between the sexes. The experiment is quite
large enough, and if he could discriminate to any significant degree we
should see it in the results, which we do not.”

TABLE I
HYPOTHETICAL RESULTS: DETERMINATION OF SEX

                        Experience 1                Experience 2
                        Judgment of sex             Judgment of sex
Category                Total   Correct  Incorrect  Total   Correct  Incorrect
Expected by chance      10      5        5          1000    500      500
Physician's judgment    10      6        4          1000    505      495
P                       0.38                        0.38

4 Student, “On the Error of Counting with a Haemacytometer,” Biometrika 5: 351-360, 1906-1907.

Now for both experiences, the P, which is the probability of obtain- 
ing by chance as good a result as the one obtained, on the null hypothe- 
sis that the probability of either sex is a half, is the same, namely, 0.38. 
But the experience 2, being based on large numbers, is convincing 
positive evidence of the truth of the null hypothesis within practical 
limits. I do not intend to attempt to analyze what is the justification 
for the added conviction provided when the numbers are large, beyond 
suggesting that it has the same basis as what has been argued here is the 
general principle of inference which is operative throughout. When the 
numbers are small, a middle P will occur with considerable frequency 
if the null hypothesis is true or if an alternative is; with large numbers 
such a P will occur frequently in the case of the null hypothesis but not 
in the case of a practical alternative. Hence with large numbers, a 
middle P provides probative evidence in favor of the null hypothesis. 
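
The figures of Table I, and the reason the two experiences differ in weight, can be checked with exact binomial sums. In the sketch below the null probability of one half is from the text; the "practical alternative" of 60 per cent correct judgments is an assumption introduced purely for illustration. The exact tail areas come out at roughly 0.38 to 0.39 for both experiences.

```python
import math

def binom_pmf(n, k, p):
    """Binomial probability of exactly k successes, computed on the log scale."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, i, p) for i in range(k, n + 1))

def lower_tail(n, k, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, i, p) for i in range(0, k + 1))

experiences = {"Experience 1": (10, 6), "Experience 2": (1000, 505)}
ALTERNATIVE = 0.6  # hypothetical "practical alternative": 60 per cent correct

results = {}
for name, (n, correct) in experiences.items():
    # P of the table: chance of a result at least this good if p = 1/2.
    p_null = upper_tail(n, correct, 0.5)
    # Chance of a result no better than this if the physician is right 60% of the time.
    p_if_alternative = lower_tail(n, correct, ALTERNATIVE)
    results[name] = (p_null, p_if_alternative)
    print(f"{name}: P = {p_null:.3f}; "
          f"chance of so poor a showing under the alternative = {p_if_alternative:.2e}")
```

With ten cases so modest a showing remains quite likely even if the physician can discriminate; with a thousand it would be practically impossible, and that is what makes the middle P of experience 2 convincing.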

Here we have disclosed one fundamental weakness in the position of 
those who contend that small samples can be effectively utilized in 
statistical investigations if the calculations of the P’s are correctly 
made. If it were a fact that conclusions are drawn only when the P is 
very small and the null hypothesis disproved, then so far as concerns 
the main considerations here developed, there would be a certain 
validity to this view, for small P’s are more or less independent, in the 
weight of the evidence they afford, of the numbers in the sample. But 
if actually it is the fact that conclusions will be drawn from P’s which 
are not small, then only very considerable numbers in the sample are 
reliable. 

If a test for the difference between means has yielded a large or 
middle P, it does not merely fail to disprove the null hypothesis that 
the true means are equal; it furnishes affirmative evidence that the means 
are substantially equal. If the numbers on which the test is based are 
large, the evidence will have convincing weight; otherwise not. Con- 
trariwise a low P points affirmatively toward the alternative that the 
means are unequal. It is the merit of some kinds of tests that they 
indicate unequivocally the specific alternative toward which they point.⁵
Such are tests for the difference between means or the difference be- 
tween variances or tests for skewness. Other tests such as the frequency 
$\chi^2$ or some applications of the analysis of variance do not have this
characteristic. In Table II is presented an experience of mortalities 
following certain operations with and without the use of a vaccine for 
the prevention of peritonitis. Four tests are given for the “null hypothe- 
sis” that the true mortality rates are identical for patients with and 
without vaccine: (1) the probability of getting as many differences in 
the favorable direction as found; (2) the appropriate P for the $\chi^2$ test
of the four-fold table constituted by the totals; (3) the Fisher test of
combining the values of $\chi^2 = -2 \ln P_i$; (4) the summation of the $\chi^2$ and
degrees of freedom for the separate operations. The resulting P’s are
considerably different. In terms of the usual rationalization, each of
these tests is equally valid for testing the null hypothesis. If the null
hypothesis were true, that is, if the vaccine were ineffective and the
mortality for any operation were the same whether the vaccine were
used or not, the appropriate limiting value of each test function would
occur only infrequently, one just as infrequently as the other. But the
tests are differently sensitive to the presence of different alternatives.
In terms of the Neyman-Pearson formulation they have different
powers for any particular alternative, and hence are likely to give
different results in any particular case. How blind is the procedure of
doing some test of significance, when there is no knowledge at hand as
to whether it is likely to show a significant result or not show one, no
matter how importantly different the facts may be from the hypothesis
tested. The importance of this consideration is underscored when we
realize that in practical applications the failure to show a significant
result will be taken to corroborate the null hypothesis. It is an
important but neglected task of mathematical statistics to investigate what
alternatives are particularly pointed to by specified findings with
different tests.

TABLE II

MORTALITY RATES FOR OPERATIONS WITH AND WITHOUT USE OF VACCINE:
TESTS OF SIGNIFICANCE OF DIFFERENCES

           Vaccine                       No vaccine                    Mortality
Type of                Hospital deaths               Hospital deaths   difference,
operation  Operations  Number  Per cent  Operations  Number  Per cent  per cent
A          107         2       1.9       142         4       2.8       -0.9
B          28          3       10.7      60          9       15.0      -4.3
C          21          3       14.3      34          5       14.7      -0.4
D          21          4       19.0      34          8       23.5      -4.5
E          47          3       6.4       45          4       8.9       -2.5
F          21          1       4.8       26          2       7.7       -2.9
Total      245         16      6.53      341         32      9.38      -2.85

Test                               P
1. Signs                           0.016
2. Total difference mortality      0.11
3. Combination of P’s (Fisher)     0.91
4. Summation of $\chi^2$ and D.F.  0.98

5 Elsewhere I have suggested that those tests are ones which in principle can be stated alternatively and equivalently in terms of an estimate and its confidence limits. Joseph Berkson, “Comments on Dr. Madow’s ‘Note on Tests of Departure from Normality’ with Some Remarks Concerning Tests of Significance.” This Journal 46: 539-541, December 1941.
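
The first two P’s of Table II can be recomputed from the tabulated counts. The sketch below makes two assumptions that the text does not state: that the sign test is one-sided, and that the P for the four-fold table of totals is the one-sided tail of the normal deviate corresponding to $\chi^2$ with one degree of freedom and no continuity correction. Under those assumptions the results agree with the tabulated 0.016 and 0.11. Tests 3 and 4 are not attempted, since the P’s for the separate operations are not given.

```python
import math

# (operations, deaths) with vaccine and without vaccine, from Table II.
TABLE_II = {
    "A": ((107, 2), (142, 4)),
    "B": ((28, 3), (60, 9)),
    "C": ((21, 3), (34, 5)),
    "D": ((21, 4), (34, 8)),
    "E": ((47, 3), (45, 4)),
    "F": ((21, 1), (26, 2)),
}

# Test 1: signs. Count operations in which vaccine mortality is the lower,
# then take the one-sided binomial tail with p = 1/2 (assumed one-sided).
k = len(TABLE_II)
favorable = sum(dv / nv < dc / nc for (nv, dv), (nc, dc) in TABLE_II.values())
p_signs = sum(math.comb(k, i) for i in range(favorable, k + 1)) / 2 ** k

# Test 2: four-fold table of the totals.
n_vac = sum(v[0] for v, _ in TABLE_II.values())
d_vac = sum(v[1] for v, _ in TABLE_II.values())
n_ctl = sum(c[0] for _, c in TABLE_II.values())
d_ctl = sum(c[1] for _, c in TABLE_II.values())
a, b = d_vac, n_vac - d_vac      # vaccine: died, survived
c, d = d_ctl, n_ctl - d_ctl      # no vaccine: died, survived
total = a + b + c + d
chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
# One-sided normal tail of sqrt(chi2); the one-sided reading is an assumption.
p_totals = 0.5 * math.erfc(math.sqrt(chi2) / math.sqrt(2))

print(f"1. Signs:                      P = {p_signs:.3f}")
print(f"2. Total difference mortality: P = {p_totals:.2f}  (chi-square = {chi2:.3f})")
```

The same six pairs of rates thus yield a P near 0.02 by one test and near 0.11 by another, which illustrates how much the result depends on the alternative to which each test is sensitive.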

I should like to see the development of investigation of the finding of 
middle P’s. I am not ready to say what this should be or just what it 
would lead to. But this is an example of what I mean. With the develop- 
ment that we now have, which emphasizes the low P’s, we find such 
statements as the following in the literature, and it is typical of the 
essential procedure in many fields in which statistical tests are applied. 
A standard curve for estimating dosage from mortality has been estab- 
lished with its confidence zones, from a first set of data. A set of data for 
another drug is to be used for estimating the potency of a second drug. 
But realizing the possibility that the standard curve may not be
applicable any more, the author counsels the use of some controls to see
whether the standard curve still applies for the first drug. He says,
“When the controls have been shown to agree with the standard of the
regression line by the appropriate $\chi^2$ or $t$ test, the first curve can be
used.” Now what is meant by this is that if the test does not show a 
low P, the curve can be used, which is to say that if the test shows a 
middle P, the curve will be used. It should be clear on consideration that 
if there is a real discrepancy of a given size between the present condi- 
tions and the curve, a P which is not low will result with small numbers, 
while with the same discrepancy a low P will result if the numbers are 
large. The use of the suggested rule could easily be disastrous if drugs 
were standardized on the basis of it and small numbers were used. 
Investigation should be made which could result in a rule not such as 
just given, but rather of the following kind: “If the control is tested 
with data including so and so degrees of freedom and if the test results 
in a P of this amount or higher, the curve may be accepted as stable.” 
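
A rule of the kind proposed requires knowing how often a P that is not low will arise when a real discrepancy exists. The sketch below is a simplified stand-in, not the author's procedure: it assumes a two-sided normal-deviate test of a mean with known standard deviation, a discrepancy of half a standard deviation, and a tolerated miss rate of 10 per cent, all of which are hypothetical choices.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_CRIT = STD_NORMAL.inv_cdf(0.975)  # two-sided 5 per cent point, about 1.96

def prob_p_not_low(discrepancy_in_sd, n):
    """Chance that a two-sided z test on n controls gives P > 0.05 when the
    true mean is off by the stated number of standard deviations."""
    shift = discrepancy_in_sd * n ** 0.5
    return STD_NORMAL.cdf(Z_CRIT - shift) - STD_NORMAL.cdf(-Z_CRIT - shift)

def smallest_n(discrepancy_in_sd, tolerated_miss=0.10):
    """Smallest number of controls for which the discrepancy escapes
    detection no more often than tolerated_miss."""
    n = 1
    while prob_p_not_low(discrepancy_in_sd, n) > tolerated_miss:
        n += 1
    return n

miss_with_4 = prob_p_not_low(0.5, 4)
miss_with_100 = prob_p_not_low(0.5, 100)
needed = smallest_n(0.5)
print(f"P not low with   4 controls: {miss_with_4:.3f}")
print(f"P not low with 100 controls: {miss_with_100:.4f}")
print(f"controls needed for a miss rate of 10 per cent or less: {needed}")
```

Under these assumptions the same discrepancy passes unnoticed in most trials with four controls and almost never with a hundred, so a statement that the controls "agree" means little unless the number of controls is stated with it.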








THE STATISTICAL WORK OF THE LEAGUE OF 
NATIONS IN ECONOMIC, FINANCIAL 
AND RELATED FIELDS 


By Charles K. Nichols
Economic, Financial and Transit Department, League of Nations 


IT WILL COME as a surprise to some to learn that the Economic, Financial
and Transit Department of the League of Nations is maintaining
the bulk of its work in Princeton, New Jersey, with certain basic
tasks still being executed in Geneva. It is a tribute to the perseverance 
of its members as well as to the foresight and generosity of the member 
governments still contributing to the support of the League, and to 
certain private institutions¹ in the United States. Largely, the
activities are a continuation of the vital fact-finding functions performed in
Geneva, but in addition special studies on issues likely to be contro- 
verted in the post-war period are being prepared. 

The chief consideration governing the decision to maintain this or- 
ganization seemed to be that its efforts would be most useful in settling 
the great problems following the peace. The world then will be vastly 
different, and without up-to-date information the difficulties of making 
sound decisions in economic matters would be increased. Furthermore, 
what was learned during the inter-war period can be evaluated and 
crystallized into a positive plan for economic collaboration in the fu- 
ture. Also, the net-work of contacts and associations with government 
personnel can be at least partially maintained so that in the post-war 
period a more effective job of collecting data and promoting under- 
standing in an informal way can result. 

Before discussing the history of the economic and financial work it 
might be wise to recall briefly the wide scope of problems dealt with by 
the technical sections of the Secretariat assisted by advisory expert 
committees. Health, child welfare, opium control, housing, nutrition, 
land and sea communications besides economic and financial questions 
were carefully investigated in order to furnish the Council and the 
Assembly of the League, as well as the governments of the several 
states, accurate and meaningful information. In these tasks some of the 
Sections created a respectable body of statistical data, and inter alia
exercised considerable influence on the methods employed for their
collection and publication in the separate countries.

1 The Department came to Princeton in September 1940 in response to a joint invitation extended to the Technical Services of the League by Princeton University, The Institute for Advanced Study and the Rockefeller Institute for Medical Research. The facilities of the two first mentioned academic institutions are being utilized for the work now carried on in Princeton and the Rockefeller Foundation is helping through a liberal grant.


HISTORY OF THE ECONOMIC AND FINANCIAL WORK 


Almost at the outset the Council of the League saw the necessity of 
maintaining a Committee for the consideration of economic and finan- 
cial problems. Accordingly, as an outcome of the Brussels Conference 
in 1920, representatives in close touch with economic and financial 
ministries or central banks in many countries were chosen to meet 
regularly and take up questions referred to it by the Council. The early 
work centered upon reconstruction, largely in financial matters, and 
disorganized and newly-formed states were afforded invaluable assist- 
ance in reorganizing and organizing their financial and monetary af- 
fairs. Throughout the twenties this work was a dominant interest. 
Later, in 1935, a Fiscal Committee was formed to handle special prob- 
lems of taxation and public finance. Gradually the Economic and Fi- 
nancial Committees became more active with inquiries into matters 
of industrial, economic and commercial importance and it was under- 
taken in diverse ways to bring governments together to discuss com- 
mercial policy. The World Economic Conferences of 1927 and 1933 
were notable as efforts of these Committees to promote the ratification 
of agreements to improve international economic relations. These ef- 
forts were continued throughout the inter-war period but the emphasis 
shifted somewhat from attempts to secure ratification of general agree- 
ments to the procedure of promoting more limited accords among na- 
tions of similar interests, and also toward endeavors to influence na- 
tional policies through focusing attention on problems common to 
many regions. Thus, by designating a sub-committee on the Mitigation 
of Economic Depressions, world experience was pooled and expert 
appraisal was given to the relative effectiveness of different measures 
introduced. 

This work on depressions is continuing, although most of the Com- 
mittees’ activities have been suspended. Their past labors are available, 
however, in reports and memoranda² which afford to economists and
statesmen a body of factual information and a historical record to serve 
as a guide to the formulation of international economic policy in the 
future. 

2 See for instance the two Tinbergen reports on Statistical Testing of Business-Cycle Theories (L. of N. publications 1938.II.A.23 and 1939.II.A.16) and Professor Haberler's Prosperity and Depression (L. of N. publication 1939.II.A.1).

The Economic and Financial Committees were appointed by the
Council of the League, and were composed of civil servants and bankers.
Their reports were prepared for submittal to the Council and
thence were transmitted to governments. Distinct from these Commit- 
tees has been the Economic and Financial Section of the permanent 
Secretariat, which embodied the important Economic Intelligence 
Service to be dealt with below. The Economic and Financial Section, 
made up of salaried experts engaged by the Secretary-General, was 
formed at the very first and has functioned throughout with a consist- 
ent purpose but a flexible program. Largely anonymously, its members 
have been responsible for the collection and analysis of economic infor- 
mation for the use of the Economic and Financial Committees, business 
men and bankers, economists, journalists and statesmen. Its purpose 
has been to observe and interpret world economic and financial events, 
and record a chronology of them in serial publications. In this task one 
of the chief objects has naturally been to assemble statistical informa- 
tion, which brings us to a consideration of the vast work which has been 
quietly carried on by the body now operating in Princeton, and cur- 
rently designated as the Economic, Financial and Transit Department. 


THE STATISTICAL WORK 


As early as 1920 the League Council took up the question of eco- 
nomic, financial and social statistics through the instrument of an In- 
ternational Statistical Commission composed of members of such exist- 
ing organizations as the International Institute of Statistics, the Inter- 
national Labor Office, The International Institute of Agriculture, and 
the International Chamber of Commerce. The Commission recom- 
mended that existing organs separate from the League should be util- 
ized to furnish statistical information. A minority report, however, 
suggested also that a permanent organ be set up in the Secretariat for 
this task, and such became the case. It became then the duty of the 
Economic and Financial Section to institute a service to collect and 
publish economic and financial statistics. All fields were to be covered 
except agricultural and social statistics which the International Insti- 
tute of Agriculture and the International Labor Office could furnish. 
It was also deemed advisable to consult the International Institute of 
Statistics thereby drawing on the accumulated technical knowledge of 
that body’s experience in statistical theory. 

Confronted with the task of collecting such diverse information from 
political divisions using widely different procedures, it was soon seen 
that international comparisons could only be made if the national 
statistics were available in comparable form, and represented not dis- 
similar phenomena measured by comparable methods. At first the work 
was confined to the publication of the Monthly Bulletin of Statistics, and 
to the preparation of memoranda on special problems of trade, tariffs 
and finance. This brought the Section in close contact with government 
statistical offices and offered an opportunity to suggest methods for 
improving national statistics and for presenting them so as to obtain 
better international comparability. It was necessary, however, con- 
stantly to suggest and request, and without authority to insist, the 
work could not progress as desired. Nevertheless, it was possible in 1927 
to commence publication of the Statistical Year Book embodying data 
included in the Monthly Bulletin, but enlarging on them. 

It appeared, after several years’ effort to induce governments to co- 
operate, that some official means would be necessary to secure this co- 
operation. Hence, in 1927 the Economic Committee recommended to 
the Council that a conference of government representatives be called 
to conclude an International Convention on Economic Statistics. Ac- 
cordingly, in 1928 delegates from 40 countries and representatives of 
several Institutes concerned with international statistics were as- 
sembled to consider the agenda prepared by the Economic and Finan- 
cial Section. The resulting Convention was ratified by 26 states and 
bound the Contracting Members to publish certain classes of eco- 
nomic statistics according to principles agreed on at the Conference and 
incorporated into the Convention. 

This represented a unique step toward international cooperation in
technical fields and furnished some official backing for the task of the 
Economic and Financial Section. And in fact it became less difficult 
from the date of that Conference to obtain statistics compiled according 
to similar principles and published in comparable form. 


THE COMMITTEE OF STATISTICAL EXPERTS 


Perhaps the most important result of the Conference was the creation 
of the Committee of Statistical Experts, provided for in Article 8 of the 
Convention. To this body, appointed by the League Council and con- 
stituted of specialists chosen on the basis of special competence, were 
delegated the tasks of further developing sound principles for the com- 
pilation of national statistics, and of examining the problem of interna- 
tional comparability. 

Working in close connection with the Economic Intelligence Service, 
the Committee of Statistical Experts met in 1931 and yearly from 1933 
to 1939 and succeeded in considering virtually the entire field of eco- 
nomic statistics, and from 1935 managed also to inquire into some prob- 
lems relating to financial data. The results of their work are presented 
in full reports on each session. Partial reprints containing the Committee’s
main recommendations on specific topics are available in a series
of Studies and Reports on Statistical Methods. 

The work of the Committee is so extensive it is impossible to cover 
each phase of it here. However, some indication of its importance can 
be realized by considering, as an example, what was done in the field of 
statistics on international trade. The Committee established by the 
middle of the thirties what is known as the Minimum List of Commodi- 
ties for International Trade Statistics. Many countries, adhering to the 
convention of 1928, agreed immediately to use the list, while others fol- 
lowed in a few years. By 1939 it was possible for the Economic Intelli- 
gence Service to publish trade statistics for about 30 countries which 
had been compiled according to the minimum list. Thus, for the first 
time, truly comparable trade statistics were at the disposal of states- 
men, economists and business men. 

Likewise, with indices of foreign trade and industrial production, 
statistics on the gainfully-occupied population, housing, capital forma- 
tion, balances of payments, tourism and several other topics the Com- 
mittee has established principles which governments have shown in- 
creasing willingness to apply by adapting their national statistics to the 
basic recommendations of the Committee or by publishing supplemen- 
tary tables drawn up in conformity with its Standard Classifications for 
purposes of international comparison. 

The work of the International Labor Office deserves mention here as 
its activity in the field of social statistics complements that of the Eco- 
nomic Intelligence Service, and the two bodies have been closely asso- 
ciated. By similar methods the Labor Office has promoted the wide use 
of sound principles in compiling social statistics and has influenced the
publication of internationally comparable information. The Economic 
Intelligence Service relies upon data collected by the International 
Labor Office in the fields of employment, unemployment, wages, migra- 
tion, etc. 

PUBLICATIONS 

In its current statistical work the Economic Intelligence Service has 
naturally benefited from the advice of the Committee of Statistical Ex- 
perts. The results are evident in the improvement and extension of the 
publications of the Service. The Statistical Year Book and the Monthly 
Bulletin of Statistics have been described as source documents designed 
for government and public use. Their chief advantage is the conveni- 
ence of finding in a single volume information for every area of the 
world; and because of this convenience governments are spared consid- 
erable expense. 

The two publications carry similar tables although the Year Book is
more extensive. It includes data on population by age groups and occu- 
pation; births, marriages, and deaths; mortality rates, expectation of 
life, fertility and reproduction; employment and unemployment, and 
wages. The purely economic information deals with statistics of pro- 
duction, indices of primary and industrial output, and stocks; data on 
transport; and data on trade. Financial statistics cover currency, bank 
deposits, budget accounts, public debt, prices, exchange rates, discount 
rates, bond yields and capital issues. 

Other volumes are based on the immense amount of statistical infor- 
mation collected by the Intelligence Service, but are also interpretive. 
International Trade in Certain Raw Materials and Foodstuffs, World 
Production and Prices, Review of World Trade, Balances of Payments, 
Money and Banking are all included in this group. The World Economic 
Survey, published each year since 1932, is devoted entirely to summariz- 
ing important economic happenings throughout the world. The first 
issue grew out of a special study on economic depressions conducted 
in 1931 and 1932, and the yearly issues supply a coherent record of eco- 
nomic events throughout the thirties. 

These special memoranda are important in particular because they 
are the first attempt to use the raw material of national statistics to 
make a new kind of raw material for the study of world problems. In 
each sphere the volumes have drawn together data and compiled in- 
dices for the summarization of the economic trends of the world as a 
whole. These documents are accepted as important contributions to the 
study of world economic problems, and represent a unique methodology.

It is also significant that the entire range of economic phenomena 
is scrutinized by a unified group of economists. World problems are 
thus brought into full perspective by a comprehensive and coherent 
body of data of which the parts are interpreted by competent experts. 


THE WAR AND AFTERWARDS 


In a wartime world the activities of the League naturally have been 
reduced. The importance of maintaining the technical work has been 
recognized, however, and the Economic Intelligence Service continues 
to function. The Year Book and the Monthly Bulletin are being pub- 
lished in spite of the increased difficulty of obtaining information. The 
regular works on trade, production and prices, banking, finance, etc., 
have been replaced by briefer chapters on these topics in the World Eco- 
nomic Survey. The last Survey, issued in 1941, deals of necessity largely 
with problems of war economics, as will the issue to be published in the
fall of 1942. 

The work of the Department relative to problems covered by docu- 
ments such as the Review of World Trade and World Production and 
Prices has not ceased with the cessation of these publications, however. 
Data are being assembled and special memoranda such as Raw Mate- 
rials and Foodstuffs, published in 1940; Europe’s Trade, 1941; the Net- 
work of World Trade; War-Time Rationing and Consumption; and 
Money and Banking 1940/42 (the last three recently published or to be 
published shortly) all cover significant happenings in the respective 
economic and financial spheres. They contain statistical materials and 
represent authoritative international sources of economic and financial 
data carefully sifted and digested. 

Most of these documents form part of a new set of studies concerned 
with post-war reconstruction. By analysis of the problems and economic
events leading up to and following the first World War, and in the light
of information collated by the Economic Intelligence Service concerning
trends during the inter-war period, an attempt is being made to anticipate
what measures will likely be most effective in preventing a recurrence
of the difficulties following the last war. In this work the statistics
assembled by the Intelligence Service are performing the function they 
have been assigned—namely, they are the base upon which the ana- 
lytical work is grounded. In this respect, then, the labors of this League 
organization to improve national statistics and to secure international 
comparability may prove valuable in settling the reconstruction prob- 
lems which will be faced by a post-war world. 

It is probably true that an orderly international system does not rest 
entirely upon such technical achievements. The fulfillment of psycho- 
logical and deeper social needs is no doubt more fundamental in the 
process of attaining peaceful international relations. However, in criti- 
cal periods the existence of a workable technique may be of first impor- 
tance in promoting the greater end. At the next world peace conference 
when the idea of an international organization will undoubtedly enjoy 
unprecedented popularity the presence of a coordinated plan for eco- 
nomic cooperation may prove decisive. 


THE MARKET FORECASTING SIGNIFICANCE OF 
MARKET MOVEMENTS 


By L. C. Wilcoxen


As the seer studies his crystal, so too thousands of traders study
their charts of the price movements of both stocks and commodities.
They are the courageous exponents of the philosophy that, “If
you want to know what the market is going to do, study the market.”

It is the purpose of this study to analyze the actions of certain markets 
in order to determine their inherent forecasting possibilities. 

To establish the characteristics of market behavior, it is necessary to 
select moving definite time intervals of market action and to determine 
the probable actions in the immediate subsequent moving intervals. 
This statistical analysis yields specific “forecasting” criteria for these 
specific time intervals, which may be tested comparatively. From these 
results the criterion for maximum trading profit may be determined and 
its possibilities appraised. 

The raw data employed in this study are the daily “highs” of the 
U. S. Steel Common stock, wheat May futures and cotton October
futures. The “highs” were selected arbitrarily rather than the “lows,” 
which would have served equally well. Daily figures were chosen to 
permit analysis, based on short time intervals, to give the greatest 
volume of data and to make possible more precise testing of the criteria 
than would be possible, for instance, by the weekly “highs.” 

The question of what subsequent market action will follow prior 
market action may be expected to depend upon the lengths of the prior 
and subsequent intervals. The amplitude of the subsequent market 
movement may be expected to bear some relation to the amplitude of 
the prior movement. Thus, this study begins with the selection of prior 
intervals, subsequent intervals and the statistical determination of the 
relative market movements. 

The first of the pertinent intervals for which U. S. Steel stock was 
analyzed is the 1-1, that is a one-day-prior interval with its subsequent 
one-day interval. Other intervals are designated as the 2-2, 4-4, 8-8, 
16-16, 32-32, 48-32 and 64-32, each pair of figures representing the 
prior and subsequent intervals in market days. 
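
The pairing of prior and subsequent intervals can be sketched as follows. This is only an illustration under stated assumptions: the function name `interval_changes` is hypothetical, the sample prices are invented, and measuring the subsequent change as a percentage of the price at the end of the prior interval is an assumed convention that the text does not spell out.

```python
def interval_changes(highs, prior_days, subsequent_days):
    """Pair each prior-interval percentage change with the change that follows it.

    For every day t with enough data on both sides, the prior change runs from
    highs[t - prior_days] to highs[t] and the subsequent change from highs[t]
    to highs[t + subsequent_days]. Both are expressed in per cent.
    (Assumed convention: each change is taken relative to its starting price.)
    """
    pairs = []
    for t in range(prior_days, len(highs) - subsequent_days):
        start = highs[t - prior_days]
        prior = 100.0 * (highs[t] - start) / start
        subsequent = 100.0 * (highs[t + subsequent_days] - highs[t]) / highs[t]
        pairs.append((prior, subsequent))
    return pairs


# Hypothetical daily highs, for illustration only (not data from the study).
highs = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]
for prior, subsequent in interval_changes(highs, 2, 2):  # the "2-2" interval
    print(f"prior {prior:+.2f}%  subsequent {subsequent:+.2f}%")
```

Because the intervals move forward one market day at a time, successive pairs overlap; a 48-32 interval would be obtained by calling the same function with 48 and 32.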

The data selected for the purpose of this study were the daily “highs” 
of the U. S. Steel Common stock prices from January 1, 1922, to
January 1, 1932. This stock was deemed suitable because of its broad and
continuous market action throughout this period, during which time it
was reasonably free from so-called “manipulation.” In this ten year
period there were approximately 2900 market days. 

First the percentage change for each of 2900 market one-day inter- 
vals (day to day) was determined. From these a series of frequency 
charts was made for successive brackets of percentage change from the 
prior day (one-day-prior interval). For instance, a frequency chart was 
made of all one-day changes following one-day percentage changes of 
from zero to 0.5 per cent. Percentage movements which were in the 
same direction were considered positive and those in the opposite 
direction, negative. From the frequency charts the probable error was 
computed from the formula $.6745\sqrt{x^{2}}$. In this study the resulting
probable error of each case is considered as the forecasting criterion. 

For these one-day intervals, seven brackets of prior interval percent- 
age changes were selected and frequency curves were made for each. 
The probable errors are spotted on Chart I. 
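
A rough sketch of the bracketing step is given below. It rests on assumptions that should be read as such: the helper names are hypothetical, and the way a sign is attached to the "probable error" (each movement contributing x·|x|, with the sign of the total carried outside the root) is only one possible reading of the formula, adopted here because the plotted criteria are described as positive or negative. The constant 0.6745 is the usual factor relating a probable error to a root-mean-square deviation.

```python
import math


def relative_movements(pairs):
    """Sign each subsequent change relative to the direction of the prior change.

    Positive means the market continued in the same direction, negative that it
    reversed. Pairs with a zero prior change have no direction and are skipped.
    Returns (size of prior change, signed subsequent change) tuples.
    """
    out = []
    for prior, subsequent in pairs:
        if prior == 0:
            continue
        direction = 1.0 if prior > 0 else -1.0
        out.append((abs(prior), direction * subsequent))
    return out


def signed_probable_error(xs):
    """0.6745 times a signed root-mean-square of xs (an assumed reading)."""
    if not xs:
        return None
    mean_signed_square = sum(x * abs(x) for x in xs) / len(xs)
    magnitude = 0.6745 * math.sqrt(abs(mean_signed_square))
    return math.copysign(magnitude, mean_signed_square)


def bracket_criteria(pairs, edges):
    """Summarize the relative movements falling in each bracket of prior change."""
    moves = relative_movements(pairs)
    result = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        xs = [x for size, x in moves if lo <= size < hi]
        result[(lo, hi)] = signed_probable_error(xs)
    return result
```

Each value returned by `bracket_criteria` corresponds to one plotted point for one bracket; with seven brackets the `edges` list would hold eight boundaries.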


CHART I

INTERVAL PROBABILITY CURVES OF U. S. STEEL COMMON STOCK
[Chart not reproduced; axis label: PAST INTERVAL MOVEMENTS IN PERCENT]





A figure has been computed for the probability of the total observa- 
tions and is noted in Table I. The value of this figure, which will be 
designated as the Summary Probability Characteristic, is .286 per cent. 
It will appear later that this characteristic is important in making 
comparisons with the forecasting values of different criteria. 

In Chart I, a probability curve is shown fitted by eye to the com- 
puted points. It is significant to note that one-day market movements 
of less than 0.7 per cent indicate probable reactions the following day. 
Movements of greater amounts indicate that on the following day the
movements will be in the same direction. In the aggregate the more
numerous minor reactions practically counterbalance the movements 
in the same direction. In spite of this fact, the summary probability 
characteristic, which is based on the dynamics of the movement (X 
squares instead of X’s) is of appreciable magnitude and in the same 
direction. 

In the same manner the data for various intervals have been an- 
alyzed. Curves are fitted to the computed points for intervals of 2-2, 
4-4, 8-8, 16-16, 32-32, 48-32 and 64-32 and are included in Chart I. 

TABLE I

THE SUMMARY PROBABILITY CHARACTERISTICS FOR U. S. STEEL COMMON STOCK,
WHEAT MAY FUTURES AND COTTON OCTOBER FUTURES FOR VARIOUS INTERVAL
PROBABILITIES, USING THE FORMULA $.6745\sqrt{x^{2}}$

              For U. S. Steel    For wheat      For cotton
Interval      Common Stock       May futures    October futures
              S.P.C.             S.P.C.         S.P.C.

1-1             .286               .431
2-2             .455               .554
4-4             .783               .904
8-8            1.414               .674           1.14
16-16          2.24               2.06            2.43
32-32          3.32              -2.64            3.76
48-32          2.98
64-32          2.34


Note the change in the characteristics of the successive curves. The
1-1, 2-2 and 4-4 are similar in that minor movements during the prior
interval are followed in each case by probable reversed movements.
For each of the next several intervals, there appears a change in which
the market action of all amplitudes is followed by market movements
continued in the same direction. This is true for the curves 8-8, 16-16
and 32-32. Then the next two curves, the 48-32 and 64-32 intervals,
revert back to the type of the three minimum intervals, that is,
the 1-1, 2-2 and 4-4 curves.

This is truly interesting. It shows that for the 1-1, 2-2 and 4-4 
periods, a minor cycle exists with amplitudes of something less than one 
per cent. Thus, while true, the fact is of trifling significance. 

The reactive cycle does not again begin to appear until the intervals 
of 48-32 and 64-32 are reached when it is again qualified. In these cases 
only market movements of less than 8 and 12 per cent forecast a 
reversal (i.e. indicate a probable negative future movement), while 
those of greater amplitudes forecast a continued movement in the same 
direction. Here, as in the 1-1, 2-2 and 4-4, the reverse movements
practically counterbalance the continued movements. How important 
this defined movement is will be brought out by further analysis. 

The principle by which the forecasting characteristics of U. S. Steel
Common stock have been analyzed undoubtedly has wide application.
It is used here to further the understanding of free markets. 

The characteristics of the wheat May futures market were investi- 
gated, using for data the daily “highs” for the nine years January 1, 
1921, to January 1, 1930, inclusive, in which there were approximately 
1900 market days. The forecasting curves for the several intervals analyzed
are shown in Chart II, where they are compared with the curves
of the U. S. Steel Common stock. 


CHART II 

In the same manner the cotton October futures market was analyzed,
the data used being the daily “highs” for the ten year period of January, 
1921, to December, 1930. In this period there were approximately 1850 
market days. The curves for the intervals 8-8, 16-16 and 32-32 are also 
shown on Chart II. The important point to notice is that for short 
periods the forecasting curves of the several markets are markedly 
similar. It is quite possible that this is a general characteristic of all 
free markets. 

Attention has previously been called to the Summary Probability 
(Forecasting) Characteristic of U. S. Steel Common stock for the 1-1 
intervals. This characteristic, which, it should be recalled, is the
weighted average of points on the probability curve, has been computed
for steel, wheat and cotton for various intervals in each. For
comparison, their amounts are shown in Table I. 

Two things are strikingly brought out. First, the fact that, for the 
shorter intervals, the values of the characteristics are approximately 
the same for the different markets. This may indicate that fundament- 
ally all markets are governed by the same law. Second, that for U. S. 
Steel Common, the value of the characteristic decreases after the 32- 
day-preceding interval, an indication that this may be the limiting 
period for which such criteria should be used for practical forecasting. 

The various probability curves are statistically correct for anticipating
the probable stock market action of U. S. Steel Common stock, and
as they are specific, it is possible to test their comparative merits. 
Given the market action up to any date, current or past, it is possible 
to consider the probabilities as indicated from each of the curves as 
exact forecasts. The day to day progressive forecasts through any past 
period will give the same results as if the forecasts were made currently. 
Obviously the stock should not be bought or sold unless the probable 
movement is greater than the round turn cost of completing the trans- 
action. It is equally obvious that once a commitment has been made, it 
should be maintained until the forecast is reversed again to the extent 
of the round turn cost. 

The present (1942) percentage round turn cost is variable, decreas- 
ing from 4.4 per cent, when the price is $10 a share, to approximately 
0.4 per cent when the price is $300 a share. These are the extreme per- 
centages which a forecast price change must exceed before a commit- 
ment should be profitable. This is, therefore, the general criterion for 
commitments, either for buying or selling for all forecasting intervals. 

The specific criteria for buying and selling are obtained by the appli- 
cation of these general criteria to the various interval probability 
curves. For instance, when the stock is at 100, the required price change 
to cover commitment costs is 0.7 per cent. With the 8-8 interval, this 
change is practicable when the stock during the prior interval has 
changed 1.4 per cent. Thus, at that price level 1.5 per cent is the critical 
point for making a commitment. 

The practical application of this method in actual tests will now be 
considered. The hypothetical market trading here carried through is 
a comparatively rapid and inexpensive, yet correct, way in which to test
trading theories. 

The procedure is simple and will be illustrated in some detail by the 
test of the 1-1 interval criteria. Reference has already been made to the 
daily percentage changes in the Steel Common. These percentages
were originally tabulated on the work sheet in a column adjacent to the 
stock price. Beginning with January 1, 1922, the eye runs down this 
column until January 20, when the price change of plus 2.63 per cent 
exceeded 2.00 per cent, which at the market level was required to meet 
the round turn cost of the transaction. The stock was assumed to have 
been bought as of that date at 88-1/8. It was not until June 19 that a 
minus change of more than 2.00 per cent occurred, the change amount- 
ing to 2.40 per cent. The stock was assumed sold at 99-2/8, the “high” 
for that day, and a short position taken at that same price. On October 
8 a positive change of 2.12 per cent occurred and the market position 
was reversed, this time at the “high” for the day of 102-2/8. This 
position was maintained until the end of the year when the price was 
107-6/8. 
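
The procedure just described can be sketched as a small routine. It is a simplified illustration, not the writer's work sheet: the function names are hypothetical, a single fixed threshold is assumed (in the article the required change varies with the price level through the round turn cost), and commitment costs are left out of the point total.

```python
def reversal_trades(prices, changes, threshold):
    """Return (day_index, side, price) for each commitment under a 1-1 style rule.

    A long position is taken when the day's change is at least +threshold per
    cent and a short position when it is at most -threshold. The position is
    held until the opposite signal appears, when it is reversed at that day's
    price. (Assumption: one fixed threshold for all price levels.)
    """
    trades = []
    position = None  # None, "long" or "short"
    for i, change in enumerate(changes):
        if change >= threshold and position != "long":
            position = "long"
            trades.append((i, position, prices[i]))
        elif change <= -threshold and position != "short":
            position = "short"
            trades.append((i, position, prices[i]))
    return trades


def gross_points(trades, final_price):
    """Sum the point gains and losses, closing the last position at final_price."""
    exits = [price for _, _, price in trades[1:]] + [final_price]
    total = 0.0
    for (_, side, entry), exit_price in zip(trades, exits):
        total += (exit_price - entry) if side == "long" else (entry - exit_price)
    return total


# Hypothetical prices and daily percentage changes, for illustration only.
prices = [100, 103, 104, 101, 100, 103]
changes = [0.0, 3.0, 0.97, -2.88, -0.99, 3.0]
trades = reversal_trades(prices, changes, 2.0)
print(trades)
print(gross_points(trades, final_price=105))
```

Signals that merely repeat the direction already held are ignored, which matches the rule that a commitment is maintained until the forecast is reversed.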

Thus, for this year’s trading the results were a gain of 11-2/8 points, 
a loss of 9-2/8 points with the stock held at further loss of 0-4/8 points, 
a gross profit of 1-4/8 points. The costs of these transactions were 32, 
70, and 72 cents, a total of $1.74, giving the net result for the year of 
$0.24 loss. This includes the brokerage which would have been charged 
if the account had been closed. 

It was observed that for the ten year period there was a gross profit 
of 189 points, a trading cost of 96.33, leaving a net profit from the 125 
transactions of 92.67. That averages only $9.00 per year per share. The 
rate of profit was quite meager and the year to year record was quite 
erratic. Actual net losses were sustained in four of the ten years, and in 
two years amounted to as high as $45.00 per share. 

While the hypothetical market trading demonstrates the theory, it 
also proves the 1-1 criterion was anything but a practical one to employ 
in trading. Hypothetical accounts were determined in the same manner 
for the other intervals, whose probability curves are shown in Chart I, 
the summaries of which are shown in Table II. The gauges of practi- 
cability, of course, are the items of net profit and the consistency of the 
profits throughout the decade. 

Note well the trend of total net profits in the Decade Summary. As 
the criterion intervals are increased to 16-16, the total net gains in- 
crease to a maximum, after which they decrease. This is in harmony 
with the indications of the Summary Probability Characteristics in 
Table I. Moreover, as the same intervals were increased, the total of 
the annual net losses was decreased until the 32-32 criterion interval 
yielded a minimum. It must be concluded from these facts that the 
most satisfactory trading criterion will be in the region of the 16-16 and 
32-32 intervals. 

This conclusion prompted a further investigation of the trading possibilities
of these two criteria in confirmation. The procedure was but
slightly different than with the single criterion. The first commitment 
was made when both criteria, concurrently, indicated that the stock 
should be bought or sold. The position was maintained until both 
criteria, concurrently, indicated that the position should be reversed. 
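
The confirmation rule amounts to acting only when both criteria agree. A minimal sketch follows, assuming each criterion has already been reduced to a daily signal of +1 (buy), -1 (sell) or 0 (no indication); this encoding and the function name are assumptions made for illustration.

```python
def confirmed_positions(signal_a, signal_b):
    """Daily position when two criteria must agree before a commitment changes.

    Each signal is +1 (buy), -1 (sell) or 0 (no indication). The position
    starts at 0 (no commitment) and changes only on days when both signals
    are non-zero and equal; otherwise the existing position is maintained.
    """
    position = 0
    positions = []
    for a, b in zip(signal_a, signal_b):
        if a == b and a != 0:
            position = a
        positions.append(position)
    return positions


# Hypothetical signals from the 16-16 and 32-32 criteria, for illustration only.
sig_16 = [1, 1, -1, -1, 0, 1]
sig_32 = [0, 1, 1, -1, -1, -1]
print(confirmed_positions(sig_16, sig_32))
```

Requiring agreement necessarily yields fewer reversals than either criterion alone would, since a reversal needs both signals to turn.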


TABLE II
DECADE SUMMARIES OF HYPOTHETICAL TRADING

                      Total annual   Total annual   Total        Total number
Intervals             net losses     net gains      net gains    of commitment
                      in points      in points      in points    reversals

1-1                     92.67          187.32         92.67          125
2-2                     85.17          247.66        162.49          174
4-4                     41.14          317.39        276.25          169
8-8                     23.44          295.34        271.90          222
16-16                   33.82          382.49        348.67          183
32-32                   11.39          297.98        286.59          110
48-32                   62.57          173.11        110.54          157
64-32                  128.99          129.95          0.96           87
16-16 with 32-32
  in confirmation       15.63          251.12        235.49           65





Table II shows that with this criterion, the total net gains were
decreased from the maximum of 348.67 points to 235.49 points, while the
total annual net losses were also decreased, from the 33.82 points of
the 16-16 criterion to 15.63 points. While these tests are too crude to
be the basis of exact judgments, the writer feels that these confirming
criteria must approximate the most satisfactory one that it is possible
to use as a basis for market trading in U. S. Steel Common stock.

From a statistical standpoint, the results obtained in the foregoing 
are not fully conclusive. The analysis has the inherent fault that the 
test covers the same period from which the curves themselves were 
derived. 

This is entirely satisfactory for the short interval forecasting curves, 
such as the 2-2, 4-4 or perhaps even the 8-8 intervals. It is quite pos- 
sible, however, that with the longer intervals, accidental cycles have 
given a character to the 16-16 and 32-32 interval probability curves 
that would not be duplicated in another period. 

For this reason, a more detailed investigation has been made of the 
forecasting merits of these two criteria in confirmation. The period for 
which hypothetical market operations were executed includes the years 
1918 to 1921, the originally analyzed period of 1922 to 1931, and also 
the years 1932 to 1940, each inclusive, the three periods of four, ten 
and nine years each, making a total of twenty-three years. This long 
period should exclude the possibility of accidental successes and yield
results in which great confidence may be placed. 

While the commitment procedure was the same as previously em- 
ployed, the accounting procedure was more detailed. The amount of 
the trading account was computed at the completion of each commitment.
Furthermore, in the following commitment, the full amount of the
trading account was assumed to be employed, all stock being bought outright,
or if the commitment was short, the amount sold was given a coverage 
equal to the price of the stock at the time of the sale. In both cases, for 
accurate accounting, this necessitated computing the extent of the 
commitments to a ten thousandth of a share. 


CHART III

HYPOTHETICAL MARKET ACCOUNT WITH 16-16 & 32-32 CRITERIA IN CONFIRMATION
[Chart not reproduced. Series plotted: market price of U. S. Steel; account
holdings in shares; value of account in dollars. Axis labels: PRICE OF STOCK
AND VALUE OF ACCOUNT; NUMBER OF SHARES IN ACCOUNT]





The account was opened with $100.00 and the first commitment was 
on January 8, when the stock was bought at a price of 95-7/8. The 
round turn brokerage and taxes amounted to $64.00 for 100 shares so 
that 1.0361 shares were purchased at a per share cost of $96.51. The 
stock was sold March 5 at 91-2/8 for a loss of 4-5/8 points and a net 
loss, including brokerage and taxes on the 1.0361 shares, of $5.46. The 
account then stood at $94.54. This full amount was employed in the 
short sale on the same day at 91-2/8, which after deducting for broker- 
age and taxes permitted a sale of 1.0282 shares. The commitment was 
covered April 18 at 94-5/8 for a second loss of 3-3/8 points, and a net 
loss, including brokerage and taxes on the 1.0282 shares, of $4.19.
This brought the new balance down to $90.35. 
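
The arithmetic of these first two commitments can be retraced as follows. The prices, the $64.00 round turn charge, the share counts and the dollar results are those given in the text; the variable names are illustrative, and the per-share cost of the short commitment is not stated in the text, so it is backed out here from the stated 1.0282 shares (an assumption).

```python
# First commitment: the whole $100.00 account buys stock outright.
account = 100.00
buy_price = 95 + 7 / 8            # 95-7/8
cost_per_share = 64.00 / 100      # $64.00 round turn on 100 shares
shares = account / (buy_price + cost_per_share)

sell_price = 91 + 2 / 8           # 91-2/8
first_loss = shares * (buy_price + cost_per_share - sell_price)
account_after_first = round(account - first_loss, 2)

# Second commitment: short sale of 1.0282 shares at the same price.
short_shares = 1.0282             # figure stated in the text
# Implied round turn cost per share (assumption: derived, not quoted).
implied_cost = account_after_first / short_shares - sell_price
cover_price = 94 + 5 / 8          # 94-5/8
second_loss = short_shares * (cover_price - sell_price + implied_cost)
account_after_second = round(account_after_first - second_loss, 2)

print(f"shares bought      {shares:.4f}")
print(f"first net loss     {first_loss:.2f}")
print(f"account            {account_after_first:.2f}")
print(f"second net loss    {second_loss:.2f}")
print(f"account            {account_after_second:.2f}")
```

Carrying the holdings to four decimal places, as the text notes, is what allows the full balance to be re-employed at each commitment.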

Such was the procedure followed for the twenty-three years. The only
departure was the rendering of balances at the end of each year for 
charting purposes. In Chart III are shown the records of the main 
items of the account, the stock prices, the extent of the holdings and 
the amounts of the account. 

Chart III warrants close examination for it summarizes the entire 
study. The first year a loss of 23.85 per cent was incurred, while a 
period of eight years elapsed before the loss was finally recouped. For 
three more years the results were practically neutral. During the next 
decade there was a spurt in which the account was multiplied nearly 
fifty fold, only to be reduced by a third in the next and final two years. 

During the ten year period 1922 to 1931, inclusive, the account 
increased from $84.45 to $171.16, an increase of but 7.4 per cent com- 
pounded annually. This is a clear indication that the period from which 
the criterion was established had little to do with the success over the 
entire 23 year period. Over this period, the account increased from 
$100.00 to $3368.77, or at the rate of 16.5 per cent, compounded an- 
nually. 
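
The compound rates quoted above can be checked with a short calculation. The account values and period lengths are those given in the text; the function name is illustrative. On these figures the 23 year rate comes out at about 16.5 per cent, and the ten year rate at a little over 7 per cent, close to the figure quoted.

```python
def annual_rate(start, end, years):
    """Compound annual growth rate that turns start into end over the given years."""
    return (end / start) ** (1.0 / years) - 1.0


whole_period = annual_rate(100.00, 3368.77, 23)   # 1918 to 1940 inclusive
base_decade = annual_rate(84.45, 171.16, 10)      # 1922 to 1931 inclusive

print(f"23 year period: {100 * whole_period:.1f} per cent compounded annually")
print(f"10 year period: {100 * base_decade:.1f} per cent compounded annually")
```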


This entire analysis permits making four important conclusions. 

1. The interval probability method of analysis of market action 
demonstrates there are no well defined cycles in U. S. Steel Common 
stock, or in the wheat and cotton futures market. 

2. Forecasts of U. S. Steel Common stock based on the probability
curves of intervals up to the 32-32 day intervals all yield net profits in 
various amounts. 

3. Maximum profits are obtainable when the forecasts are based 
upon the action of the market over an interval of about 32 market days. 

4. Even the best interval probability forecasting results are so 
erratic from year to year, that the usefulness of this method of market 
forecasting is seriously impaired. 








THE USE OF INVERSIONS AS A 
TEST OF RANDOM ORDER 


By A. C. Rosander
War Production Board 


It can be shown that the inversions in position of n objects or magnitudes
1, 2, 3, ..., n for the n! possible permutations are distributed
in a family of symmetrical frequency distributions with moments which
are functions of n only. In counting inversions of position the natural
order 1, 2, 3, ..., n is used as the criterion. All possible inversions for
two and for three objects are as follows:


        2 objects                      3 objects
               Number of                      Number of
Permutation    inversions      Permutation    inversions
    12             0               123            0
    21             1               132            1
                                   213            1
                                   231            2
                                   312            2
                                   321            3


These lead to the following frequency distributions which can be ex- 
tended to include n objects or magnitudes: 


Number of        n=2          n=3
inversions    Frequency    Frequency
    0             1            1
    1             1            2
    2                          2
    3                          1
Sum (n!)          2            6


The general rule to follow in counting inversions is to take each rank 
and count how many lower ranks follow it; the sum of all such counts 
is the number of inversions for that permutation. 
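
The counting rule can be written out directly. The sketch below is illustrative (the function names are not from the article); it counts, for each rank, how many lower ranks follow it, and then tallies the counts over all permutations of n objects by brute-force enumeration.

```python
from collections import Counter
from itertools import permutations


def count_inversions(perm):
    """For each rank, count the lower ranks that follow it, and sum the counts."""
    return sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )


def inversion_distribution(n):
    """Frequency of each inversion count over all n! permutations of 1..n."""
    return Counter(count_inversions(p) for p in permutations(range(1, n + 1)))


for n in (2, 3):
    dist = inversion_distribution(n)
    print(n, sorted(dist.items()))
```

Enumeration is practical only for small n, since the number of permutations grows as n!; for a single observed ordering only `count_inversions` is needed.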

In the inversion distribution the expected number of inversions per
permutation, the arithmetic mean, is

$$\frac{n(n-1)}{4}$$

while the variance is

$$\frac{n(n-1)(2n+5)}{72}.$$

The total number of inversions is n! n(n-1)/4 while the area of the
frequency histogram is n!.
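
These two moments can be checked against direct enumeration for small n. The sketch below uses exact rational arithmetic so that the comparison involves no rounding; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def count_inversions(perm):
    """Number of pairs in which a higher rank precedes a lower rank."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def exact_moments(n):
    """Mean and variance of the inversion count over all n! permutations."""
    counts = [count_inversions(p) for p in permutations(range(1, n + 1))]
    mean = Fraction(sum(counts), len(counts))
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return mean, variance


def formula_moments(n):
    """Mean n(n-1)/4 and variance n(n-1)(2n+5)/72 as stated in the text."""
    return Fraction(n * (n - 1), 4), Fraction(n * (n - 1) * (2 * n + 5), 72)


for n in range(2, 7):
    print(n, exact_moments(n), formula_moments(n))
```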

It can be shown further that this symmetrical single-peaked distri- 
bution approaches quite rapidly the normal curve as n increases in- 
definitely. Even when n is 6 the correspondence is quite close. In this 
respect and in several others the distribution is analogous to the point 
binomial. This means that no large error will be committed if we assume 
that the arithmetic means from a large number of random samples are 
distributed as a normal probability curve with a mean of M and a
variance of σ²/N where N is the size of the sample and σ² is the
population variance indicated above.

These distributions give us a pattern which can be used to test some 
hypothesis of order. The hypothesis provides the basis of counting 
inversions, and the frequency distribution allows us to make inferences 
relative to the departure of the data from the hypothesis. Hence these 
inversions distributions may be taken to give an operational definition 
of random order; deviates of these distributions are measures of the 
departure of the data from random order. 

As an exploratory technique, the method of inversions appears to 
have some merit. Its use, however, is based upon ranked data which 
represent in general a loss of information. It assumes that the various 
permutations of order in the data are equally likely. It does not allow 
for tied ranks. Furthermore it is imperative that the hypothesis of 
order to be tested is independent of the data. Even with limitations 
there is a number of problems to which these distributions are applic- 
able. 


APPLICATIONS 


The Randomness of Tippett’s Numbers. Ten sets of Tippett’s four- 
digit numbers were selected, in groups of 25, from the first page of his 
table of random numbers. The expected number of inversions per per- 
mutation for 25 orders is 150, the variance is 458, and the standard 
deviation is 21.4. The results are as follows: 


1 See M. G. Kendall, Biometrika, June 1938; and George B. Dantzig, Annals of Mathematical
Statistics, September 1939. The writer first developed and used the method of inversions in 1937
independent of these two investigators.


                                                      Deviation of
  Set                                  Mean of       cumulative mean
 number     Mean     (Mean − M)/σ   cumulated sets    from M = 150
    1       101         −2.3            101              −49
    2       174          1.2            138              −12
    3       173          1.1            149              − 1
    4       170          0.9            155                5
    5       132         −0.9            150                0
    6       168          0.9            153                3
    7       125         −1.2            149              − 1
    8       139         −0.5            148              − 2
    9       179          1.4            151                1
   10       177          1.3            154                4


The mean of the 10 sets or samples is 153.8; since the expected mean 
is 150, the deviation is 3.8. Is this deviation to be expected on the basis 
of sampling fluctuations? The standard deviation of these means based 
on a sample of 10 is 


$$\frac{21.4}{\sqrt{10}} = 6.77$$


On this basis the deviation of 3.8 is 3.8/6.8 or 0.56 standard deviations 
above the expected mean. Assuming that the means are distributed 
approximately as a normal probability curve for samples of 10, such 
deviations have a high frequency of occurrence. Hence we infer that 
this sample of 10 could represent a population of random numbers. As 
additional evidence we notice that no one of the ten inversion values 
deviates as much as 2.5 standard deviations from the mean of 150; four 
deviates are negative while six are positive. All this is evidence that 
according to the criterion of inversions these groups of 25 four-digit 
numbers represent random arrangements. 
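
The arithmetic of this test can be retraced with a short Python sketch. The ten inversion counts are those of the table above; the variable names are illustrative only.

```python
from math import sqrt

# Inversion counts for the ten sets of 25 Tippett numbers (from the table).
counts = [101, 174, 173, 170, 132, 168, 125, 139, 179, 177]

n = 25                                          # ranks per set
M = n * (n - 1) / 4                             # expected inversions
sigma = sqrt(n * (n - 1) * (2 * n + 5) / 72)    # standard deviation per set

sample_mean = sum(counts) / len(counts)
standard_error = sigma / sqrt(len(counts))      # s.d. of the mean of 10 sets
z_of_mean = (sample_mean - M) / standard_error

# Deviates of the individual sets, in standard deviation units.
z_of_sets = [(c - M) / sigma for c in counts]

print(sample_mean, round(standard_error, 2), round(z_of_mean, 2))
print(sum(1 for z in z_of_sets if z < 0), "negative,",
      sum(1 for z in z_of_sets if z > 0), "positive")
```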

The Randomness of Measurements in Quality Control. Two hundred 
and four measurements given by Shewhart² were tested in groups of
51 and also in a single group of 204 ranks. Since there were quite a 
number of tied ranks, these were broken by assigning the ranks in- 
volved at random to the values in question. 

For the four groups of 51 ranks the expected number of inversions is 
637.5 while the standard deviation is 61.6. The results are as follows: 


2 Walter A. Shewhart and W. Edwards Deming (editor), Statistical Method from the Viewpoint of
Quality Control, Washington, D. C. The Graduate School of the Department of Agriculture, 1939, pp.
32, 90.


                      Expected
                       number
 Set       Mean           M         Mean − M     (Mean − M)/σ
  1        539          637.5        − 98.5         −1.60
  2        528          637.5        −109.5         −1.78
  3        603          637.5        − 34.5         −0.56
  4        712          637.5          74.5          1.21


The greatest deviation from the expected number is 1.78 standard 
deviations which is not an uncommon deviation to find. There seems 
no particular reason to reject any of these sets as representing a non- 
random order, on the basis of our criterion of inversions. 

When we come to test the 204 measurements as a single sample we 
find a different situation. For 204 ranks the expected number of inver- 
sions is 10,353, while the population standard deviation is 489. By 
actual count the total number of inversions is 8,954 or 1,399 below the 
expected value. In terms of standard deviations this is —2.86 units, a 
deviation which occurs only once in about 500 times based on the nor- 
mal probability distribution. Examination of the measurements indi- 
cates why the number of inversions falls so far below the expected 
number based upon random order. From observation 17 to observation 
148 inclusive there is a downward trend of the values; from observation 
149 to the end all measurements are at a high level; only 8 out of 56 
measurements fell below 4,500. Above the 148th measurement we find 
42 measurements which rank above 102, the mid-rank, and only 14 
rank below. This preponderance of large measurements at the upper 
end of the series accounts for the relatively low number of inversions. 
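
A minimal Python sketch of the two tests on Shewhart's data follows, using only the inversion counts quoted in the text. The standard deviations are computed from the variance formula given earlier; for 204 ranks this gives roughly 487 rather than the 489 quoted, so the deviate differs slightly in the second decimal. The function name is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def inversion_moments(n):
    """Mean and standard deviation of the inversion count for n ranks."""
    mean = n * (n - 1) / 4
    sd = sqrt(n * (n - 1) * (2 * n + 5) / 72)
    return mean, sd


# Four groups of 51 measurements.
group_counts = [539, 528, 603, 712]
mean_51, sd_51 = inversion_moments(51)
group_z = [(count - mean_51) / sd_51 for count in group_counts]

# All 204 measurements taken as one series.
mean_204, sd_204 = inversion_moments(204)
z_total = (8954 - mean_204) / sd_204
# Lower-tail probability under the normal approximation.
p_lower = NormalDist().cdf(z_total)

print([round(z, 2) for z in group_z])
print(mean_204, round(sd_204, 1), round(z_total, 2), round(1 / p_lower))
```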

These tests indicate an interesting point: that within a non-random 
series there may be random sets, while within random series there may 
be non-random sets. Certainly whether one obtains a test favorable to 
the hypothesis of randomness, on the basis of this criterion, depends 
on where one starts in the series and where one ends. Presumably if 
one had not 204 measurements but many times this number, providing 
that there was no better statistical control, he would obtain no better 
indication of random order. 

What this indicates is that one may obtain a test of random order 
by the mere connecting together of two or more sets of non-random 
sequences, and vice versa. Another point those in sociology and eco- 
nomics will observe is that this sequence of measurements in the elec- 
trical laboratory appears to be strikingly similar to the order of the 
magnitudes so commonly found in economic time series where any 
type of statistical control is out of the question.

The Randomness of Economic Time Series. In the data from Shewhart
we found that the parts gave a test favorable to random order, whereas 
a test of the total gave a test favorable to the hypothesis of non-random 
order. We turn now to an example from an economic time series where 
just the opposite is the case. The consumption of sulphuric acid by vari- 
ous industries in the United States during the past 13 years takes the 
following ranks: 


1928    7        1932   13        1936    6
1929    3        1933   12        1937    2
1930    5        1934   10        1938    9
1931   11        1935    8        1939    4
                                  1940    1


For 13 ranks the expected number of inversions is 39, the standard 
deviation is 8.2. In the above table the total number of inversions is 
50 or 11 above the mean, or about 1.4 standard deviations above the 
mean. This is not an unusual deviation to find; hence one may on the 
basis of this criterion consider this a random order. An examination of 
the last 9 years gives different results. For this number of ranks the 
expected number of inversions is 18; the standard deviation is 4.8. 
From the data we obtain 32 inversions, a deviation of 14 or about 2.9 
standard deviations. One is tempted with such a deviation to be skepti- 
cal of the randomness of the measurements involved. 
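
The two inversion counts can be verified with the Python sketch below. The ranks are those of the table above, with the rank for 1939 taken as 4, the only rank from 1 to 13 not otherwise assigned; the function name is illustrative.

```python
from math import sqrt

# Ranks of sulphuric acid consumption, 1928 through 1940.
ranks = [7, 3, 5, 11, 13, 12, 10, 8, 6, 2, 9, 4, 1]


def count_inversions(seq):
    """Number of pairs in which a larger value precedes a smaller one."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


def deviate(seq):
    """Observed inversions and their deviation from expectation in s.d. units."""
    n = len(seq)
    expected = n * (n - 1) / 4
    sd = sqrt(n * (n - 1) * (2 * n + 5) / 72)
    observed = count_inversions(seq)
    return observed, (observed - expected) / sd


all_years = deviate(ranks)        # 1928-1940
last_nine = deviate(ranks[4:])    # 1932-1940

print(all_years)
print(last_nine)
```

Inversions depend only on the relative order of the values, so the last nine ranks need not be renumbered from 1 to 9 before counting.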

Furthermore we might test the hypothesis that the order of the 
values from 1932 through 1940 was the same as the order of the Federal 
Reserve Board index of industrial production (1935-1939 = 100). The
order of corresponding years of the two sets of data is the same; the 
number of inversions is therefore zero. This is additional evidence 
which calls in question the hypothesis of randomness. 

The Randomness of Guessing. The probability of selecting by chance 
the correct order of n magnitudes is 1/n!. In an examination question
in which 3 elements are to be arranged in some sequence, the probabil- 
ity of getting the correct sequence on the basis of complete ignorance 
is 1 in 6 which is quite high. On the other hand, if the number of ele- 
ments is increased to 5, the chance of guessing a correct sequence is 
reduced to 1 in 120. 
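
For reference, the chance of guessing a complete order falls off with n as follows; this small Python sketch simply evaluates 1/n!, and the function name is an illustrative choice.

```python
from fractions import Fraction
from math import factorial


def chance_of_correct_order(n):
    """Probability of placing n distinct items in the correct order by pure guessing."""
    return Fraction(1, factorial(n))


for n in (3, 5, 10):
    print(n, chance_of_correct_order(n))
```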

This principle could be used in the extra-sensory perception experiments
of Rhine's in which the numbers of inversions in the order in
which the first n numbers are presented could be used as a measure
of departure from chance selection. By using a sequence of 10 magnitudes
the probability of selecting by chance the correct order is reduced
to 1 in 3,628,800.

The method of inversions can be extended to provide an operational
basis of measuring discrimination of a sequence of objects or measurements.
On the basis of this theory the expected number of inversions is
a measure of no discrimination, or guessing, and departure from the
expected number is a measure of ability to discriminate. Hence it is
possible to give a “score” of discrimination to each of the n! possible
permutations.

Random Order and Correlation.³ In case two sequences are uncorrelated,
the inversions of order of one relative to another will be distributed
according to the foregoing distributions. In other words, zero
correlation is associated with the expected number of inversions, M.

On the other hand, if two sequences are correlated the number of 
inversions will tend to be small for a high positive correlation, and to 
be large for a high negative correlation. 

These relationships suggest the following definition of a correlation
function rᵢ in terms of the number of inversions x:

$$r_i = 1 - \frac{2x}{x_m} = 1 - \frac{4x}{n(n-1)}$$

where xₘ is a constant, the maximum number of inversions for a
given n.

When x, the observed number of inversions, is equal to zero, rᵢ = 1;
when it is equal to M, which is always one-half xₘ, rᵢ = 0; when it is
equal to xₘ, rᵢ = −1.

The value of this simple linear function lies in the fact that the distribution
of x, for any given value of n, is known and can be used to
evaluate any value of rᵢ. Hence there is no need of dealing with the distribution
of the correlation coefficient. For values of n larger than 10
one can make use of the normal probability curve as an approximation
to the inversion frequency distribution, and test whether the given
value of x falls within the 5 per cent or 1 per cent levels.

Suppose we have two sets of ranks, A and B, with the set A ranked 
from 1 to 20, with the corresponding rank of set B below that of set A, 
as follows: 


Set A    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Set B   11  6  7  2  3 10  1  4 12  8  5  9 14 18 19 15 17 13 16 20
xᵢ      10  5  5  1  1  5  0  0  3  1  0  0  1  4  4  1  2  0  0  0
                                                          Total  43


3 This method appears to have been published first by M. G. Kendall, Biometrika, June 1938,
although the writer arrived at a similar index and similar conclusions independently. The writer prefers
to use inversions and their distributions rather than correlation because the former can be associated
with testing some hypothesis of order which covers a wider range of problems than does the correlation
coefficient. Kendall uses the standard error of r but this is unnecessary since r is simply a linear function
of the number of inversions.

Once the two sets of ranks are arranged in this manner, the next step
is to count the total number of inversions. This is done by beginning
at the left end of the B series, taking each number in turn, and counting
the number of ranks to the right which are smaller than the one under
consideration. The numbers of inversions corresponding to the individual
ranks of set B are shown in the row designated “xᵢ.” The total
number of inversions is the sum of this row, or 43. It does not make any
difference whether we use set A as the standard order to count the inversions
in set B, or vice versa; the number of inversions is the same.

When n is 20, the expected number of inversions is 95, the 5 per cent 
point is 70, and the 1 per cent point is 59. In other words, a value less 
than the obtained value of 43 could be obtained by chance less than 
once in 100 times. On the basis of this criterion we would reject the 
hypothesis that there is no relation between these two ranks. Then if 
we desire to translate this inferred relation into an index of correlation 
we can substitute the x value of 43 in the equation for rᵢ. If we do this
we obtain a value of 0.55 which is positive because x is less than the
mean; it would be negative if x were greater than the mean.
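
The whole example can be retraced with the Python sketch below. The Set B ranks are read here as a permutation of 1 to 20, an assumption that reproduces the stated total of 43 inversions. The significance points are taken from the normal approximation, and the names used are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Set B, listed in the order of Set A = 1, 2, ..., 20.
set_b = [11, 6, 7, 2, 3, 10, 1, 4, 12, 8, 5, 9, 14, 18, 19, 15, 17, 13, 16, 20]


def inversions_per_rank(ranks):
    """For each rank, the number of smaller ranks lying to its right."""
    return [
        sum(1 for later in ranks[i + 1:] if later < rank)
        for i, rank in enumerate(ranks)
    ]


x_i = inversions_per_rank(set_b)
x = sum(x_i)
n = len(set_b)

# Correlation function: 1 - 2x/x_m, with x_m = n(n-1)/2.
r = 1 - 4 * x / (n * (n - 1))

# Lower 5 per cent and 1 per cent points from the normal approximation.
M = n * (n - 1) / 4
sigma = sqrt(n * (n - 1) * (2 * n + 5) / 72)
five_per_cent_point = M + NormalDist().inv_cdf(0.05) * sigma
one_per_cent_point = M + NormalDist().inv_cdf(0.01) * sigma

print(x_i)
print(x, round(r, 2), round(five_per_cent_point), round(one_per_cent_point))
```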









































CORRELATION ANALYSIS BY MARGINS 


By E. J. Broster* 


Correlation analysis by the use of margins (or differences) is not
entirely a new idea except in so far as little seems to have been
done to exploit it, or, anyway, to demonstrate its potential value as a
statistical device.¹ Like any method (or may I say any other method)
of correlation analysis worthy of the name, the use of margins has
special value in the determination and even the final solution of at least 
one particular functional type of problem. In short, it fills a gap. But 
it can also be successfully applied to other kinds of problems, as I know 
from experience—and with advantage too, where the method of least 
squares is beyond the mathematical understanding of those who may 
have to make use of the resulting coefficients. 

The particular gap which the marginal method fills concerns the 
solution of problems in one independent variable in which graphing 
indicates the existence of an equation of some unknown but higher 
degree than the first, that is, of an equation of the type: 


$$Y = k + aX + bX^2 + cX^3 + \cdots$$


By using margins, we can obtain the correct or best-fitting degree of 
equation for final solution by least squares, or we can continue the 
analysis by margins to arrive at reasonably accurate results. 

Having fitted a representative curve to the coordinates either free- 
hand or on the basis of group averages, we proceed by recording a 
number of readings from it. For a reason that will become clear later, 
these should be taken in order of magnitude of the independent variable,
X. In Table I, lines (a) and (b) give the readings of a simple but
typical example. The rest of the table down to line (l) shows the method
of extracting margins up to the point where the degree of the equation
is determined together with the coefficient, or an approximation to it,
of Xⁿ, where n is the degree of the equation.

In this case n=3. We know this because Y/X of the third series of
margins in line (l) are the same, or at least the trend shown in the first


* The following quotation from the author's letter of transmittal should be of interest to many of us.

“Most of the preliminary experimental work on the use of margins I carried out in my leisure
evenings during the heavy air raids on the London area last year when I found it difficult to concentrate
on the complexities of the kind of investigations I specialise in.” Editor

1 See H. S. Will, “On Fitting Curves to Observational Series by the Method of Differences,” Annals
of Mathematical Statistics, May 1930; S. S. Bose, “Relative Efficiency of Regression Coefficients
Estimated by the Method of Finite Differences,” Sankhyā: The Indian Journal of Statistics, Part 4, 1938;
and M. Ezekiel, Methods of Correlation Analysis, Table 13, p. 46.


                                  TABLE I

        Determining the degree of equation and the coefficient of Xⁿ

                        (1)    (2)    (3)    (4)    (5)    (6)    (7)
Readings from graph:
  (a) X                  0      1      2      3      4      5     5.5
  (b) Y                  5      9     27     71    153    285   373.5
Margins: first series, to eliminate k:
  (c) X'                 -      1      1      1      1      1     0.5
  (d) Y'                 -      4     18     44     82    132    88.5
  (e) Y'/X'              -      4     18     44     82    132    177
Second series, to eliminate aX:
  (f) X''                -      -      2      2      2      2     1.5
  (g) Y''                -      -     14     26     38     50     45
  (h) Y''/X''            -      -      7     13     19     25     30
Third series, to eliminate bX²:
  (i) X'''               -      -      -      3      3      3     2.5
  (k) Y'''               -      -      -      6      6      6      5
  (l) Y'''/X'''          -      -      -      2      2      2      2

The equation is therefore of the third degree and the coefficient of X³ is 2.

        Determining the coefficient of Xⁿ⁻¹ (i.e., X²)

  (m) Y − 2X³            5      7     11     17     25     35   40.75
Margins: first series, to eliminate k:
  (n) X'                 -      1      1      1      1      1     0.5
  (o) Y₂'                -      2      4      6      8     10    5.75
  (p) Y₂'/X'             -      2      4      6      8     10    11.5
Second series, to eliminate aX:
  (q) X''                -      -      2      2      2      2     1.5
  (r) Y₂''               -      -      2      2      2      2     1.5
  (s) Y₂''/X''           -      -      1      1      1      1      1

The coefficient of X² is therefore 1.





and second margins, lines (e) and (h), is eliminated. In practice, identical
values for any Y/X series are unusual, so that the criterion is elimination
of trend, which indicates the need for setting out the original readings
in order of magnitude of X.

The calculation of the first marginal series is clear. Line (e) is inserted
to show the need for calculating the Y'/X' figures. It will rarely
be found possible to arrange for each of the X' margins to equal unity;
it may even be advisable to set out the X readings so that the first series
of margins increase or decrease progressively. The reason for this is
that, as a result of graphic errors, the need for rounding off in the
calculations, and the accumulating error these involve with each successive
extraction of margins, the final trendless Y/X series, when it is
reached, will sometimes scarcely be recognized as such owing to the
fluctuations it contains. When the readings taken from the graph
result in variations in the X margins, these margins can be plotted
against the corresponding Y margins. If they do give a trendless Y/X
series, then the coordinates must necessarily lie about a straight line
which passes through the origin. The slope of this line would be equal
to the coefficient of Xⁿ, a fact which it is necessary to keep in mind if it
is proposed to complete the analysis by the marginal method.

After the first extraction, the difference in the means of arriving at
further margins should be noted. The Y margins are simply derived
from the preceding Y/X series, while each figure in the X marginal
series is equal to the difference between the two extreme figures involved
in the original X readings, a difference which equals the sum of the
corresponding figures in the first series of X margins. In Table I for
instance, the figure of 2.5 in column (7), line (i), is equal to 5.5 − 3 in
line (a), and to 1 + 1 + 0.5 in line (c). It is generally more convenient in
practice to use the latter, and if necessary to apply the former as a
check on the calculations.²
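
The extraction of successive margins can be sketched in Python as follows. The function name `margin_series` is an illustrative assumption; the data are the readings of Table I, which follow Y = 5 + X + X² + 2X³. Each Y/X series produced this way is what numerical analysis calls a divided difference of the corresponding order.

```python
def margin_series(xs, ys, order):
    """Return the Y/X series after `order` successive extractions of margins.

    At the k-th extraction each Y margin is the difference of two adjacent
    figures in the preceding Y/X series, and the X margin is the difference
    between the two extreme X readings involved, xs[j + k] - xs[j].
    """
    ratios = list(ys)
    for k in range(1, order + 1):
        ratios = [
            (ratios[j + 1] - ratios[j]) / (xs[j + k] - xs[j])
            for j in range(len(ratios) - 1)
        ]
    return ratios


# Readings of Table I, lines (a) and (b).
xs = [0, 1, 2, 3, 4, 5, 5.5]
ys = [5, 9, 27, 71, 153, 285, 373.5]

print(margin_series(xs, ys, 1))  # line (e)
print(margin_series(xs, ys, 2))  # line (h)
print(margin_series(xs, ys, 3))  # line (l): trendless, so the degree is 3
```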

It is clear from the small number of marginal values left over at the 
end that a larger number of readings than those used in the example 
are necessary. The minimum is about ten, but a great deal depends 
upon the dispersion of the coordinates in the original diagram and the 
degree of rounding off necessary in the calculations. Dispersion always 
gives rise to difficulties of curve fitting, but even supposing the graph 
approximates closely to the type under consideration, the larger the 
number of readings taken for the purpose of the analysis the more 
accurate are the results likely to be. 

If we propose completing the analysis by margins without resorting
to least squares then the next step is to determine the value of b by
deducting cX³ from the Y series and repeating the process. This is
shown in lines (m) to (s) of Table I, where it is found that b=1. Further
analysis on the same lines would produce the complete equation:


$$Y = 5 + X + X^2 + 2X^3.$$
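
The peeling-off procedure can be sketched in Python under one added assumption: each coefficient is taken as the plain average of its trendless Y/X series, whereas the text reads the value off the series or from a line through the origin. The function names are illustrative.

```python
def margin_series(xs, ys, order):
    """Y/X series after `order` successive extractions of margins."""
    ratios = list(ys)
    for k in range(1, order + 1):
        ratios = [
            (ratios[j + 1] - ratios[j]) / (xs[j + k] - xs[j])
            for j in range(len(ratios) - 1)
        ]
    return ratios


def fit_by_margins(xs, ys, degree):
    """Estimate k, a, b, ... by removing the highest remaining power in turn."""
    remainder = list(ys)
    coefficients = {}
    for power in range(degree, 0, -1):
        series = margin_series(xs, remainder, power)
        coefficient = sum(series) / len(series)   # assumption: simple average
        coefficients[power] = coefficient
        # Deduct this term from the Y series before the next round.
        remainder = [y - coefficient * x ** power for x, y in zip(xs, remainder)]
    coefficients[0] = sum(remainder) / len(remainder)  # the constant k
    return coefficients


xs = [0, 1, 2, 3, 4, 5, 5.5]
ys = [5, 9, 27, 71, 153, 285, 373.5]
print(fit_by_margins(xs, ys, 3))
```

With the readings of Table I the sketch returns the constant 5 and the coefficients 1, 1 and 2 for X, X² and X³.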


2 Each final Y/X margin is in effect the solution for the coefficient of Xⁿ, by simultaneous equations,
of the sets of observations involved. For the second degree equation

$$Y = k + aX + bX^2$$

the solution for b from the three first sets of observations X₀, Y₀; X₁, Y₁; X₂, Y₂; by simultaneous
equations, may be written

$$b = \frac{\dfrac{X_2 - X_1}{X_1 - X_0}\,(Y_1 - Y_0) - (Y_2 - Y_1)}{\dfrac{X_2 - X_1}{X_1 - X_0}\,(X_1^2 - X_0^2) - (X_2^2 - X_1^2)}$$

This is identical to the solution obtained by the method of margins, viz.:

$$b = \frac{\dfrac{Y_2 - Y_1}{X_2 - X_1} - \dfrac{Y_1 - Y_0}{X_1 - X_0}}{X_2 - X_0}$$

We can regard each solution for b as an estimate. From the three first sets of observations, let this
estimate be b₀. Then we derive b₁ from X₁, Y₁; X₂, Y₂; X₃, Y₃; b₂ from X₂, Y₂; X₃, Y₃; X₄, Y₄; and so on.
Perfectly correlated data would give b₀ = b₁ = b₂ ....

This kind of marginal analysis may be applied to any curve in any
quadrant, so that the constants k, a, b, c, . . . can be integral or frac- 
tional, positive or negative numbers. The indices are always necessarily 
positive integers. In the example all readings and margins have positive 
signs. Negative signs however often appear at some stage of an analysis, 
when the ordinary rules apply. Every analysis indicates the sign of 
each constant. Table II shows the first stage of an analysis involving 
three quadrants, since the complete equation is 


$$Y = -10 + X + 3X^2 - X^3.$$


It should be kept in mind that the correct form of equation stating 
the relationship between any two variables may be of an entirely 
different nature. It is not surprising therefore that marginal analysis 
sometimes fails to yield results. Where this is so, its inexpedience or 
that of the manner of its application soon becomes evident in an in- 
creasing, instead of a decreasing, upward or downward trend in the 
successive Y/X series, or in a violent reversal of trend. 


                                  TABLE II

        Determining the degree of equation and the coefficient of Xⁿ

Readings from graph:
  X                     −3     −2     −1      0      1      2      3
  Y                     41      8     −7    −10     −7     −4     −7
Margins: first series:
  X'                     -      1      1      1      1      1      1
  Y' = Y'/X'             -    −33    −15     −3      3      3     −3
Second series:
  X''                    -      -      2      2      2      2      2
  Y''                    -      -     18     12      6      0     −6
  Y''/X''                -      -      9      6      3      0     −3
Third series:
  X'''                   -      -      -      3      3      3      3
  Y'''                   -      -      -     −3     −3     −3     −3
  Y'''/X'''              -      -      -     −1     −1     −1     −1

The equation is therefore of the third degree, the coefficient of X³ being 1 and its sign negative.
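
The first stage of the analysis in Table II can be sketched in Python. The stopping rule used here, a series whose values agree within a small tolerance, is an assumption that suits exact readings; with real graph readings the criterion of the text is elimination of trend, which calls for judgment or plotting. The names `detect_degree` and `tolerance` are illustrative.

```python
def margin_series(xs, ys, order):
    """Y/X series after `order` successive extractions of margins."""
    ratios = list(ys)
    for k in range(1, order + 1):
        ratios = [
            (ratios[j + 1] - ratios[j]) / (xs[j + k] - xs[j])
            for j in range(len(ratios) - 1)
        ]
    return ratios


def detect_degree(xs, ys, tolerance=1e-9):
    """Return (degree, leading coefficient) at the first constant Y/X series."""
    for order in range(1, len(xs) - 1):
        series = margin_series(xs, ys, order)
        if max(series) - min(series) <= tolerance:
            return order, sum(series) / len(series)
    return None  # no trendless series found with these readings


# Readings of Table II, generated by Y = -10 + X + 3X^2 - X^3.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [41, 8, -7, -10, -7, -4, -7]
print(detect_degree(xs, ys))
```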





On the other hand, a good fit by marginal analysis does not provide 
any guarantee that the equation obtained states the true relationship. 
This method, in common with all other methods of correlation analysis, 
invariably gives empirical formulae except possibly where the correct 
form of equation is known and used. 

A reversal of trend in the Y/X series does not necessarily condemn 
the method as being unsuitable for the particular case. It should 
generally be regarded as a signal that the first stage of the analysis has 
been reached. It is then a matter of choosing between the pre-reversal 
and the first reversal series, and the X and Y series corresponding to the
one with the smaller trend should be plotted in the way described above
in order that a final decision can be made. 

Where a violent reversal of trend takes place, the analysis should be 
discarded, but not necessarily the method. A good fit may sometimes 
be obtained on the assumption that the best-fitting equation involves 
negative indices which need to be converted to positive indices. For 
instance, it may happen that a good fit can be obtained on the basis of

$$Y = a\,\frac{1}{X} + k + bX + \cdots$$

which, converted for marginal analysis, becomes

$$YX = a + kX + bX^2 + \cdots$$
For the purpose of the calculations, YX would be regarded as the 
dependent variable, and treated in exactly the same way as Y in the 
example above. 
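
A small Python sketch of this conversion follows. The data are hypothetical, generated from Y = 3/X + 2 + 0.5X purely for illustration. Taking YX as the dependent variable makes the second Y/X series trendless at the value of b, whereas the same extraction applied to Y itself leaves a trend.

```python
def margin_series(xs, ys, order):
    """Y/X series after `order` successive extractions of margins."""
    ratios = list(ys)
    for k in range(1, order + 1):
        ratios = [
            (ratios[j + 1] - ratios[j]) / (xs[j + k] - xs[j])
            for j in range(len(ratios) - 1)
        ]
    return ratios


# Hypothetical readings from Y = a/X + k + bX with a = 3, k = 2, b = 0.5.
xs = [1, 2, 3, 4, 5, 6]
ys = [3 / x + 2 + 0.5 * x for x in xs]

# Converted dependent variable: YX = a + kX + bX^2.
yx = [y * x for x, y in zip(xs, ys)]

converted_second = margin_series(xs, yx, 2)   # trendless, equal to b
raw_second = margin_series(xs, ys, 2)         # still shows a trend

print([round(v, 6) for v in converted_second])
print([round(v, 6) for v in raw_second])
```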

Another useful application of the marginal method lies in the partial 
solution of problems in two variables, especially where the number of 
observations is too small, or the nature of the dependent variable’s 
relationship to one of the two independent variables is too uncertain, 
to permit of the use of least squares without preliminary tests. 

Table III contains some figures from the Board of Trade’s Produc- 
tion Index. Columns (1), (2) and (3) give respectively the years, the 
general index numbers and the gas and electricity index numbers. The 
output of gas and electricity depends principally upon industrial pro- 
duction as a whole; but there are also some other factors which are 
responsible for the obvious upward trend in time independent of in- 
dustrial production, and which can therefore be represented by the 
catch-all factor, time, statistically represented by the series of years. 




















                                 TABLE III

         Board of Trade Production Indices*
                  (1924 = 1000)                  Marginal series

                              Gas and
 Year    t     General      electricity
                  X              Y        t'     X'     Y'    67.6t   Y − 67.6t
     (1)         (2)            (3)      (4)    (5)    (6)     (7)       (8)

 1927    0      1,068          1,197      -      -      -       0      1,197
 1928    1      1,055          1,260      1    −13     63      68      1,192
 1929    2      1,118          1,358      1     63     98     135      1,223
 1930    3      1,032          1,387      1    −86     29     203      1,184
 1931    4        937          1,424      1    −95     37     270      1,154
 1932    5        933          1,470      1     −4     46     338      1,132
 1933    6        986          1,562      1     53     92     406      1,156
 1934    7      1,108          1,700      1    122    138     473      1,227

* Source: Statistical Abstract for the United Kingdom, 1940, p. 307.

Some discretion must be used in deciding the order in which the data
be set out. In all, or nearly all, cases where time is taken as a factor
bearing a naturally or logarithmically linear relationship to the dependent
variable, the best arrangement is chronological. If the appropriate
form of equation is Y = k + at + f(X), the equation to the two marginal
series as in columns (5) and (6) will be Y' = a + F(X'). Unless the X
series has a distinct and fairly smooth trend in time, the latter will
usually reduce to Y' = a + bX', since the X' margins are derived from a
disorderly arrangement of X.

The coordinates of the two series in columns (5) and (6) lie rather 
more closely about a straight line than usual, and the fitting of the 
graph by personal judgment should yield reasonably accurate results. 
The equation thus obtained is 


$$Y' = 67.6 + 0.483X'.$$


According to this method, therefore, the gas and electricity index, 
based on 1924=100, rose annually at an average rate of 6.76 points as 
a result of extraneous factors, while on the linear assumption it varied 
by 0.483 directly with every change of 1 point in the general production 
index. 

After removing the effect of extraneous factors from the gas and 
electricity index, we are left with a problem in one independent vari- 
able, in which graphic testing can be carried out. The adjusted index 
numbers are shown in column (8) of Table III. Plotted against the 
index numbers of production in general, they give a set of coordinates 
which lie about a straight line, thus proving the linear assumption 
underlying the figure of 0.483. 
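
The marginal series and the adjusted index of Table III can be reproduced with the Python sketch below. The figure of 67.6 is the annual allowance read from the graph as described above; the variable names are illustrative.

```python
# Board of Trade production indices, 1927-1934 (1924 = 1000).
general = [1068, 1055, 1118, 1032, 937, 933, 986, 1108]       # X, column (2)
gas_elec = [1197, 1260, 1358, 1387, 1424, 1470, 1562, 1700]   # Y, column (3)

# Successive differences in chronological order: columns (5) and (6).
x_margins = [later - earlier for earlier, later in zip(general, general[1:])]
y_margins = [later - earlier for earlier, later in zip(gas_elec, gas_elec[1:])]

# Remove the annual allowance for extraneous factors: columns (7) and (8).
annual_allowance = 67.6
time_effect = [round(annual_allowance * t) for t in range(len(gas_elec))]
adjusted = [round(y - annual_allowance * t) for t, y in enumerate(gas_elec)]

print(x_margins)
print(y_margins)
print(time_effect)
print(adjusted)
```

Plotting `adjusted` against `general` gives the graphic test of linearity described in the text.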

Without the facts revealed by this analysis, we could scarcely feel 
safe in applying the method of least squares to obtain the figures we 
require, that is, the two coefficients, and not merely a ready means of 
estimating the gas and electricity index number for any given year and 
for any given general production level. On the basis of the data in 
columns (1) to (3) of Table III, least squares gives the equation 


Y = 68.3t + 0.468X + 69.9.


The figures of 68.3 and 0.468 approximate closely to the results we
have already obtained. The question therefore arises whether a final
solution by the least squares method would always justify the work it
involves. Personal preference, experience, and the degree of accuracy
needed must provide the answer. But whether we insist on the least
squares solution or not, my own feeling in the matter is that the adoption
of facts instead of assumptions as a working basis is more important
even than the calculation of correct error margins about which
so much has been written in recent years.
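
For comparison, the least squares coefficients can be recomputed from columns (1) to (3) of Table III. The Python sketch below solves the two normal equations in deviation form; it is an ordinary textbook solution written for this illustration, not a transcription of the author's working.

```python
def two_variable_least_squares(t, x, y):
    """Fit y = k + a*t + b*x by least squares; return (k, a, b)."""
    n = len(y)
    t_mean, x_mean, y_mean = sum(t) / n, sum(x) / n, sum(y) / n
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_tx = sum((ti - t_mean) * (xi - x_mean) for ti, xi in zip(t, x))
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Normal equations: a*s_tt + b*s_tx = s_ty and a*s_tx + b*s_xx = s_xy.
    determinant = s_tt * s_xx - s_tx ** 2
    a = (s_ty * s_xx - s_tx * s_xy) / determinant
    b = (s_tt * s_xy - s_tx * s_ty) / determinant
    k = y_mean - a * t_mean - b * x_mean
    return k, a, b


t = list(range(8))                                            # 1927 = 0
general = [1068, 1055, 1118, 1032, 937, 933, 986, 1108]       # X
gas_elec = [1197, 1260, 1358, 1387, 1424, 1470, 1562, 1700]   # Y

k, a, b = two_variable_least_squares(t, general, gas_elec)
# On the 1924 = 1000 scale of Table III the constant k comes out near 699.
print(round(a, 1), round(b, 3), round(k, 1))
```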

But the marginal method has practical limits. Some linear problems 
in two independent variables may be solved by the use of margins 
without recourse to graphs. It follows that it may also be possible to 
solve problems in three independent variables by margins with the 
aid of graphs. But such cases in my experience are so rare that the 
marginal method would in any circumstances scarcely be worth con- 
sidering in problems of more than two independent variables except 
where the basic data are known to possess a very high degree of corre- 
lation. 

All I have used as margins in the examples above are the successive 
differences. In linear problems one could take instead all the deviations 
from the lowest observation, the next lowest, and so on. In my ex- 
perience, which has been derived from numerous experiments, the 
addition of any other margins to the successive differences merely 
increases the complications and the work involved without yielding 
results of any greater degree of accuracy. Successive differences seem 
to provide the right kind and the right number of margins. 

There is, perhaps, one objection to successive differences. They
account for all observations twice with the exception of the two extremes
for which they account only once. Bose³ pairs the observations
and thus extracts alternate differences which he calls successive differences,
and so avoids this objection. But he also loses some 50 per cent
of the marginal observations and thereby increases the degree of error,
which in certain conceivable circumstances might be very serious indeed.

Bose is concerned entirely with linear problems in one independent 
variable. If it were sound practice to calculate the regression coefficient, 
as he does, by dividing the sum of the Y margins by the sum of the X 
margins, the marginal method would long ago have displaced all others. 
But such is not the case. The sum of successive differences is equal to 
the difference between the two extreme observations, so that if Bose 
had not in effect omitted every other successive difference he would 
have fitted the curve by reference to two sets of observations only, as 
he does in his Range method. Yet this would usually give far less in- 
accurate results than alternate differences. 
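
The telescoping of successive differences is easily seen numerically. The Python sketch below uses hypothetical observations scattered about Y = 2X, invented here purely for illustration.

```python
# Hypothetical observations, roughly Y = 2X, set out in order of X.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

x_margins = [b - a for a, b in zip(xs, xs[1:])]
y_margins = [b - a for a, b in zip(ys, ys[1:])]

# Summing all successive differences leaves only the two extreme observations.
slope_all_differences = sum(y_margins) / sum(x_margins)
slope_extremes = (ys[-1] - ys[0]) / (xs[-1] - xs[0])

# Alternate differences: observations paired as (1, 2), (3, 4), (5, 6), (7, 8).
slope_alternate = sum(y_margins[::2]) / sum(x_margins[::2])

print(slope_all_differences, slope_extremes, slope_alternate)
```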

There is something more to be learned from Bose. His half-range 
method (in which he divides his basic data into two equal groups, 
extracts the differences between the two first sets of observations in 


3 Loc. cit., Appendix I, p. 346.

the groups, the two second, and so on, and divides the sum of the Y
margins by the sum of the X margins) merely provides a roundabout
way of determining the regression coefficient from two-group averages;
for the sum of the half-range margins in each series is equal to the
difference between the sums of the basic observations in the two
groups.⁴
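
The equivalence of the half-range method and two-group averages can be checked with a short Python sketch. The observations are hypothetical, invented for illustration, and are split into two equal groups in order of X.

```python
# Hypothetical observations, roughly Y = 2X, set out in order of X.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

half = len(xs) // 2

# Half-range margins: first of group two minus first of group one, and so on.
x_half_margins = [xs[i + half] - xs[i] for i in range(half)]
y_half_margins = [ys[i + half] - ys[i] for i in range(half)]
slope_half_range = sum(y_half_margins) / sum(x_half_margins)

# Two-group averages: slope of the line joining the two group means.
x_mean_1, x_mean_2 = sum(xs[:half]) / half, sum(xs[half:]) / half
y_mean_1, y_mean_2 = sum(ys[:half]) / half, sum(ys[half:]) / half
slope_group_averages = (y_mean_2 - y_mean_1) / (x_mean_2 - x_mean_1)

print(slope_half_range, slope_group_averages)
```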

For simple linear correlation, the method of margins is superfluous. 
It has no advantages even over the graphic and group-average meth- 
ods, and has one or two serious disadvantages. As I hope I have shown, 
however, if it is employed with care and with the aid of graphs to avoid 
summation, it undoubtedly provides a simple means, in some of the 
more complex problems, of determining empirical forms of basic 
equation where pure logic fails, as it so often does in the case of eco- 
nomic and financial relationships, to give the correct forms. 


4 H. S. Will (loc. cit., pp. 165-166) like Bose, appears to have overlooked this fact. His formula for
linear series makes provision for any number of steps, k, which, if it happens to be a factor of the number
of observations, n, gives the same solution as n/k-group averages except where k = 1, when the solution
is based entirely upon the two extreme sets of observations. Will also provides formulae for calculating
the regression coefficients of other types of equation by the method of margins, but their application is
restricted to problems in which the form of equation is logically determinable.




THE STANDARD ERROR OF PERCENTILES


By W. Duane Evans
U. S. Bureau of Labor Statistics 


In recent studies, percentiles of various kinds have been much
used to demonstrate economic differences between segments of
populations. This use has been specially marked in surveys of the
distribution of persons or families according to their incomes.

Percentiles have much to recommend them. They have a specific
meaning in terms of population values, and they may be defined in
terms which are readily understood by a person without technical
training. Moreover, they are in general somewhat less sensitive to
errors resulting from certain types of sampling bias than, for example,
the mean.1

One of the drawbacks to the use of percentiles has been the lack of
some general method by which the probable error due to chance
deviations in sampling might be investigated. With respect to the median,
the rule that the standard error is about one and one-quarter times the
easily estimated standard error of the mean is well known. However,
this rule is applicable only when the population distribution
approximates the normal form. It is not valid for the highly skewed and
flattened distributions in which incomes usually fall. In fact, in such
cases the standard error of the median is generally substantially less
than that of the mean.

The following paragraphs present a method by which the standard
error of any percentile may be estimated. In the case of large samples
this requires in general less effort and computation than an estimate of
the standard error of the mean. For large samples, the method has been
worked out for the general case of any percentile of a sub-group of a
population estimated by means of a stratified sample. The results may,
of course, be readily simplified to apply to less complex systems of
sampling. The method does not require the assumption that the parent
population is distributed in any special way.

The method presented does not result in exact values for the standard
error of percentiles, except in the limit, but rather in usable and in
many cases very close approximations to these values. The principal


1 It has been noted that high income families tend to be somewhat less cooperative in income and 
expenditure studies than families in the middle ranges. Even though high income families are infrequent 
in the population, a slight bias against them, resulting in a disproportionately low number of such 
families in a sample, may markedly change the position of the estimated mean while hardly altering the
estimate of the median.

aim is to provide the practical statistical worker with a simple means
of testing the significance of percentiles and the equipment necessary 
to design surveys with preassigned levels of reliability. 

Large Samples. The population is defined as consisting of N sampling 
units or individuals. Among these are A sampling units possessing some 
complex of characters which sets them apart from the rest. The A type 
individuals vary with respect to some additional factor X. This factor 
is of such type and the number A is sufficiently large so that the
variation is effectively continuous. Corresponding to a preassigned value of
the character X, say X_B, there are among the A type individuals a
certain number B, for which the value of X is less than X_B.

To assist in visualizing the foregoing, we may, for example, let N 
represent the number of individuals in a city. The symbol A may then 
represent the number of persons within the city that are of some 
specific race and sex. These vary according to their incomes (X). There 
are then a total of B individuals of the specified race and sex which 
have incomes of less than, say, $1,000 (X_B).

The population is divided into r independent strata (to carry out 
the analogy of the city, into r distinct geographical districts). We then 
let

N_i = the number of sampling units in the i-th stratum,

A_i = the number of A type sampling units in the i-th stratum,

B_i = the number of B type sampling units in the i-th stratum.

A sample is selected at random and without replacement in each of
the strata. With respect to this sample, let

n_i = the number of sampling units constituting the sample in the
i-th stratum,

a_i = the number of A type sampling units in the sample n_i,

b_i = the number of B type sampling units in the sample n_i.

Let the reciprocal of the sampling ratio employed in the i-th stratum
be represented by S_i. Then

$$S_i = N_i / n_i. \tag{1}$$


Estimates of quantities are denoted by primes. Estimates of A and B
are then defined by the relations

$$A' = \sum_{i=1}^{r} S_i a_i \tag{2a}$$

$$B' = \sum_{i=1}^{r} S_i b_i. \tag{2b}$$

It is evident that E(A') = A and E(B') = B.

A percentile, X_K, is to be determined for the group of A type
individuals in the population. Let the percentile be defined by K, which
represents the proportion of the A type individuals to which the
percentile is to refer. Then, for example, K = 0.25 for the first quartile, or
K = 0.50 for the median.

The quantity Δ is defined by the equation

$$\Delta = KA - B. \tag{3}$$

It then represents the difference between the number of individuals of
type A in the population with a value of the character X less than X_K
and the number with a value of the character X less than X_B. An
estimate of Δ is obtained as follows:

$$\Delta' = \sum_{i=1}^{r} S_i (K a_i - b_i). \tag{4}$$

It may be shown that

$$E(\Delta') = \sum_{i=1}^{r} S_i n_i (K p_{a_i} - p_{b_i}) = \Delta \tag{5}$$

where p_{a_i} is the proportion or probability of occurrence in the i-th stratum
of individuals of the A type. The definition of p_{b_i} is similar, and both
may be represented as follows:

$$p_{a_i} = A_i / N_i, \qquad p_{b_i} = B_i / N_i.$$
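
The estimates in equations (2a), (2b), and (4) amount to weighting each stratum's sample counts by the reciprocal of its sampling ratio. A minimal Python sketch follows; the `Stratum` class, the function name, and the sample counts are illustrative assumptions and are not taken from any actual survey.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    N: int  # N_i: sampling units in the stratum
    n: int  # n_i: sampling units drawn from the stratum
    a: int  # a_i: A type units in the sample
    b: int  # b_i: B type units in the sample (A type with X below X_B)

    @property
    def S(self) -> float:
        """Reciprocal of the sampling ratio, equation (1)."""
        return self.N / self.n


def estimates(strata, K):
    """Return (A', B', Delta') from equations (2a), (2b), and (4)."""
    A_est = sum(s.S * s.a for s in strata)
    B_est = sum(s.S * s.b for s in strata)
    delta_est = sum(s.S * (K * s.a - s.b) for s in strata)
    return A_est, B_est, delta_est


# Hypothetical two-stratum sample; K = 0.5 refers to the median.
strata = [Stratum(N=10_000, n=500, a=120, b=40),
          Stratum(N=4_000, n=400, a=90, b=55)]
A_est, B_est, delta_est = estimates(strata, K=0.5)
print(A_est, B_est, delta_est)
```

In this made-up case the estimated frequency difference is positive, so fewer than half of the estimated A type individuals fall below the preassigned value and the median would be estimated to lie above it.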


We may now note a very important fact. On the basis of any particular
sample, we may estimate the percentile X_K to be either above or
below the value X_B. However, the latter can occur only if the frequency
difference Δ' for this particular sample is negative in sign. Then, if we
can determine the relative frequency of occurrence or probability of
negative values of Δ' under any particular set of conditions, we at the
same time have specified the relative proportion of all times when on
repeated sampling we will estimate the percentile X_K to lie below X_B.
This may be expressed by the relationship

$$p(\Delta' < 0) = p(X'_K < X_B) = \int_{-\infty}^{0} f(\Delta')\, d\Delta' \tag{6}$$

where f(Δ') represents the probability distribution of the estimated
frequency differences. This relationship is true irrespective of either
population or sample size.
We may now proceed to examine the form of the distribution of Δ'
in samples of large absolute size. Since the quantity Δ' is a linear
function of a number of hypergeometrically distributed variates, one may
at once infer that as the size of the sample on which it is based increases
while the ratio of sample to population size remains small, the
distribution of Δ' will approach the normal probability distribution as a limit.
The latter form will give a fair approximation to the probability
distribution of Δ' in the case of samples of even moderate size drawn from
a large population.

Under the given conditions, equation (6) may then be rewritten as
follows:

$$P(X'_K < X_B) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-t} e^{-u^2/2}\, du \tag{7}$$

where

$$t = \Delta / \sigma_{\Delta'}. \tag{8}$$


In evaluating (7), it is necessary to have the variance of the
distribution of Δ' in terms of the population parameters. This is easily obtained
in the following manner. Since the individual strata are independent,
it will be sufficient to determine the variance of Δ_i', where

$$\Delta_i' = S_i (K a_i - b_i) \tag{9}$$

because, from (4),

$$\sigma^2_{\Delta'} = \sum_{i=1}^{r} \sigma^2_{\Delta_i'}. \tag{10}$$

From (9)

$$E(\Delta_i'^2) = S_i^2 E(K^2 a_i^2 + b_i^2 - 2K a_i b_i). \tag{11}$$

But it may be shown that

$$E(a_i^2) = n_i^2 p_{a_i}^2 + n_i p_{a_i} q_{a_i} (N_i - n_i)/(N_i - 1) \tag{12a}$$

$$E(b_i^2) = n_i^2 p_{b_i}^2 + n_i p_{b_i} q_{b_i} (N_i - n_i)/(N_i - 1) \tag{12b}$$

$$E(a_i b_i) = n_i^2 p_{a_i} p_{b_i} + n_i p_{b_i} q_{a_i} (N_i - n_i)/(N_i - 1), \tag{12c}$$

where q_{a_i} = 1 - p_{a_i} and q_{b_i} = 1 - p_{b_i}.

Since the variance of any variate is equal to the expected value of its
square minus the square of its expected value, we may combine
equations (5), (10), (11), and (12) as follows:

$$\sigma^2_{\Delta'} = \sum_{i=1}^{r} S_i^2 n_i \left[ K(1-K) p_{a_i} - (1-2K)(K p_{a_i} - p_{b_i}) - (K p_{a_i} - p_{b_i})^2 \right] \left[ (N_i - n_i)/(N_i - 1) \right]. \tag{13}$$

If the appropriate population parameters are known or can be
estimated, equation (7) can now be evaluated to determine the probability
that a percentile may be estimated to lie below some specific value
X_B. It is only necessary to evaluate t and refer to suitable tables for a
normal probability distribution of unit standard deviation and zero
mean. The desired probability will be represented by the area falling
below minus t.
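
The evaluation just described can be sketched in Python, with the complementary error function standing in for the printed normal tables. The function names, the tuple layout for each stratum, and the numerical inputs are assumptions made for illustration only.

```python
import math


def variance_delta_prime(strata, K):
    """Equation (13). Each stratum is (N_i, n_i, p_a, p_b), where
    p_a = A_i / N_i and p_b = B_i / N_i are population proportions."""
    total = 0.0
    for N, n, p_a, p_b in strata:
        S = N / n                      # equation (1)
        d = K * p_a - p_b
        bracket = K * (1 - K) * p_a - (1 - 2 * K) * d - d * d
        finite_factor = (N - n) / (N - 1)
        total += S * S * n * bracket * finite_factor
    return total


def delta(strata, K):
    """Equation (3) summed over strata: Delta = K*A - B."""
    return sum(N * (K * p_a - p_b) for N, _, p_a, p_b in strata)


def prob_estimate_below(strata, K):
    """Equations (7) and (8): normal-approximation probability that the
    percentile is estimated to lie below X_B (the area below minus t)."""
    t = delta(strata, K) / math.sqrt(variance_delta_prime(strata, K))
    return 0.5 * math.erfc(t / math.sqrt(2))


# Hypothetical population: two strata, median (K = 0.5).
example = [(50_000, 1_000, 0.30, 0.14), (20_000, 500, 0.40, 0.19)]
print(delta(example, 0.5), prob_estimate_below(example, 0.5))
```

When X_B coincides with the true percentile, Δ is zero and the sketch returns one-half, as the argument in the text requires.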

By repeating the procedure for two different values of X, say X_B1
and X_B2, we may determine the mathematical probability that X_K will
be estimated to lie between X_B1 and X_B2. Choosing appropriate values
of X, we may then reconstruct the probability distribution of X'_K over
any desired range.

To obviate this awkward procedure, we must learn something of the
form of distribution of X'_K. In the first place, interest centers only in
the shape of the central portion of the distribution of X'_K; the extreme
tails of the distribution are of little importance. This is equivalent to
saying that equation (7) will be evaluated only for values of t which
lead to probabilities of significant size, say for values between -3 and
+3. Since the variance of Δ' decreases with an increase in the sample
size, it is apparent from (8) that as the sample becomes larger the values
which may be assigned to X which lead to values of t lying within any
preassigned interval will cover a shorter and shorter range. In a large
sample the entire range of such values will be small. Two important
results follow from this fact.

First, the population distribution within this narrow range may be 
assumed to be approximately linear, since it was originally assumed 
that the distribution of the A type individuals with respect to X was 
continuous, though of unspecified form. As the size of sample increases 
and the range of relevant values of X becomes narrower, it follows 
that the relationship between Δ and these values of X approaches
linearity. 

Second, examine the form of equation (13). The same reasoning may
be applied here. In this expression only the values of p_{b_i} in the various
strata depend on the value of X, and over a narrow range of values of
X, the values of the p_{b_i} will change but slightly. Over this short range,
then, the variance of Δ' is effectively constant.

With σ_Δ' approaching constancy and Δ becoming a linear function
of X, it is apparent from equation (8) that the relationship between t
and X also approaches linearity in the range over which we are
interested in determining the form of distribution of X'_K. But as this
relationship becomes linear, it follows from (7) that the distribution of
X'_K necessarily approaches the normal form. In a sufficiently large
sample, then, the distribution of estimates of any percentile is very
closely approximated by a normal probability curve.

If we accept the distribution of X'_K as approximately normal, t is
obviously equivalent to the number of standard deviations of X'_K
represented by the absolute difference between X_K and X_B.

We may therefore set down

$$\sigma_{X'_K} = \left( |X_K - X_B| \, \sigma_{\Delta'} \right) / \Delta. \tag{14}$$


This expression may be somewhat simplified. In the first place,
equation (13) reduces directly to the following form:

$$\sigma^2_{\Delta'} = \sum_{i=1}^{r} S_i \left[ K(1-K) A_i - (1-2K) \Delta_i - \Delta_i^2 / N_i \right] \left[ (N_i - n_i)/(N_i - 1) \right]. \tag{15}$$


In line with our previous assumptions, the finite sample factor in the
second pair of brackets may be taken as approximately equal to unity.
Now A_i represents all individuals of the A type in the i-th stratum,
while Δ_i represents only those of the A type in the i-th stratum who fall
in the relatively narrow interval between X_K and X_B. It is apparent
that in most cases Δ_i will be quite small relative to A_i, and the following
expression will be a satisfactory approximation to the variance of Δ'.

$$\sigma^2_{\Delta'} = K(1-K) \sum_{i=1}^{r} S_i A_i. \tag{16}$$

Because of the modifying factors K(1-K) and (1-2K), the
agreement between equations (15) and (16) will be least satisfactory in the
case of the extreme percentiles. However, here again sample size is a
factor. As the size of sample increases, the range of values of X for
which an evaluation of (15) may be required will become narrower,
and accordingly the maximum values of Δ_i which may appear will
become smaller. The second and third terms within the brackets thus
decrease in importance relative to the first.

Finally, we may assume that X is measured in relatively fine
intervals, and choose X_B so that X_K - X_B is equal to one-half C, where C
is the width of the class interval by which the character X is measured
and within which the frequencies of the A type individuals are
tabulated. Then Δ will be very nearly equal to one-half F_c, where F_c is the
total number of A type individuals in the population in the interval C
which includes the percentile X_K. Making the indicated changes, and
incorporating equation (16), equation (14) reduces to

$$\sigma_{X'_K} = C \left[ K(1-K) \sum_{i=1}^{r} S_i A_i \right]^{1/2} / F_c. \tag{17}$$


This expression is an approximation to the standard error of a per- 
centile. In applying it, then, due care must be exercised that the as- 
sumptions involved in its derivation are not violated. Perhaps most 
important, it should not be applied to extreme percentiles unless the 
sample is large enough to justify the assumption of normality in the 
text following equation (6), and the use of equation (16) as an approxi- 
mation to equation (15). 
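
As a rough illustration of how equation (17) might be applied, the Python sketch below evaluates it for a single-stratum case and compares the result with the form the expression takes when a percentile of the whole population is estimated from an unstratified sample of size n. The function names and all numerical inputs are hypothetical.

```python
import math


def se_percentile_stratified(C, K, F_c, strata):
    """Equation (17). `strata` is an iterable of (S_i, A_i) pairs;
    F_c is the population frequency in the class containing X_K."""
    return C * math.sqrt(K * (1 - K) * sum(S * A for S, A in strata)) / F_c


def se_percentile_unstratified(C, K, n, f_c):
    """Special case for the whole population and an unstratified sample:
    f_c is the expected sample frequency in the class containing X_K."""
    return C * math.sqrt(K * (1 - K) * n) / f_c


# Hypothetical check: population of 120,000 sampled at 1 in 100 (n = 1,200),
# with every unit of the A type, so that S = 100, A = N and F_c = 100 * f_c.
N, n, f_c, C, K = 120_000, 1_200, 126, 250.0, 0.5
S = N / n
general = se_percentile_stratified(C, K, F_c=S * f_c, strata=[(S, N)])
special = se_percentile_unstratified(C, K, n, f_c)
print(general, special)
```

The two values agree because, with a single stratum, the sum of S_i A_i becomes N squared over n and F_c becomes f_c times N over n.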

In evaluating equation (17) in any particular instance it will
probably be necessary to use sample values as approximations to the
population parameters specified. It may well be noted, then, that the
ratio C/F_c is in reality an approximation to the value of the reciprocal
of the ordinate of the distribution of the A type individuals according
to the character X at the point X_K. It follows that a considerable range
of data may be used to estimate the most probable value of the ratio
in any case where the sample results are somewhat irregular.

Equation (17) is quite general in form. It may be readily simplified
to refer to less complex sampling situations. For example, let us assume
that a percentile referring to the entire population is to be estimated
on the basis of a non-stratified sample of size n. Equation (17) then
reduces to

$$\sigma_{X'_K} = C \left[ K(1-K) n \right]^{1/2} / f_c \tag{18}$$

where f_c represents the expected frequency of A type individuals in a
sample of size n in the interval which includes the percentile. If the
percentile to be studied is the median, equation (18) becomes simply

$$\sigma_{X'_{0.5}} = C \sqrt{n} / (2 f_c). \tag{19}$$

This will be recognized as the same formula as that given by Yule and
Kendall, but derived by them by a somewhat different method.2

Small Samples. In the foregoing, the argument has been limited to 
large samples. In small samples the various distributions which have 
been studied may not be represented with sufficient accuracy by normal 
error functions, and consequently equation (7) and its simplifications 
become invalid. However, a line of attack on the problem of determin- 
ing the probable error of an estimate of a percentile based on a small 
sample is indicated. 

In the first place, since a small sample is seldom a stratified sample, 
we may limit our examination to non-stratified samples. Similarly to 


2 G. U. Yule and M. G. Kendall, An Introduction to the Theory of Statistics, London, 1937, p. 384.
the terminology previously used, let n, a, and b represent the sample
size and the number of A and B type sample units in the sample. Now,
provided that the proportions of the A and B type individuals in the
population are known or can be estimated, we may estimate the
probability of occurrence of combinations of particular values of a and b in
repeated samples of size n. From these probabilities we may determine
the probability of appearance of negative frequency differences and so
the probability that the percentile X_K will be estimated on the basis of
a sample of this size to lie below X_B. Repeating this for various
assigned values of X_B, we may construct any part of the distribution of
X'_K desired. Since n is small, the number of combinations of a and b
to be investigated is limited. The labor involved in this procedure, while 
not small, will still usually not be prohibitive. This is especially true if, 
as will many times be the case, all that is desired is a specific test of the 
significance of the difference between the percentile and some other 
single value of X. 
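
One way to carry out the enumeration just described is sketched below in Python. It assumes, purely for simplicity, that the population is large enough for the counts a and b to follow a trinomial distribution (sampling effectively with replacement); the text itself does not prescribe this form, and the function name is illustrative.

```python
from math import comb


def prob_negative_difference(n, p_a, p_b, K):
    """Probability that K*a - b is negative in a sample of size n.

    p_a: population proportion of A type individuals.
    p_b: population proportion of B type individuals (A type with X below X_B),
         so that p_b <= p_a.
    Assumes a trinomial model, i.e. a population large relative to n.
    """
    total = 0.0
    for a in range(n + 1):            # number of A type units in the sample
        for b in range(a + 1):        # number of those falling below X_B
            if K * a - b < 0:
                ways = comb(n, a) * comb(a, b)
                total += (ways * p_b ** b * (p_a - p_b) ** (a - b)
                          * (1 - p_a) ** (n - a))
    return total


# Median of a whole population (p_a = 1) with X_B placed at the true median.
print(prob_negative_difference(n=5, p_a=1.0, p_b=0.5, K=0.5))
```

For a finite population sampled without replacement, the trinomial weights would be replaced by the corresponding hypergeometric ones.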

The setting up of a single form summarizing the above procedure is 
difficult and unwieldy in the general case for any percentile. This has 
been done in the special case of the median, but the resulting expression 
is not given in the text because of its complexity and limited usefulness. 
The small sample case must be regarded as bordering on the trivial, be- 
cause in extremely few instances are percentiles used in conjunction 
with samples of very small absolute size. 

Samples of Intermediate Size. In some cases, a sample may be too 
large to permit convenient evaluation by the special procedures sug- 
gested in the text immediately above, but too small to permit the appli- 
cation of equation (17), especially in view of the simplifications incor- 
porated in the latter which are based on the assumption of very large 
sample size. Under these conditions, the procedure suggested in the 
text following equation (13) will usually suffice. This method is most 
useful if all that is desired is a test of the significance of the difference 
between an estimated percentile and some preassigned quantity. 

The validity of the use of this procedure depends upon how closely 
a linear function of one or more hypergeometric variates (depending on 
the type of sampling employed) may be represented by the normal 
probability distribution. In the case of the more central percentiles a 
fair agreement is obtained even with relatively small sample sizes. 

Sample Allocations in Stratified Samples. Neyman and others have
studied the question of the proper allocation of a limited number of
schedules between strata to produce the most efficient estimate of the
mean. It may be of interest to apply this same technique to the
problem of estimating a percentile. First, it is assumed that the cost of
obtaining each schedule is the same throughout the population. The total
number of schedules to be obtained (total sample size) is represented
by n_t. We then define a function V as follows:

$$V = \left[ C^2 K(1-K) / F_c^2 \right] \sum_{i=1}^{r} N_i^2 p_{a_i} / n_i + L \left( \sum_{i=1}^{r} n_i - n_t \right) \tag{20}$$

where L is an arbitrary Lagrange multiplier. The function V is
differentiated with respect to n_i and n_j and the results equated to zero.
Between the resulting two equations L is eliminated. From the result, it
may be shown that to minimize the variance of our estimate of a
percentile, the size of sample to be taken in any stratum should be
allocated in accordance with the following relationship,

$$n_i = n_t \sqrt{N_i A_i} \Big/ \sum_{j=1}^{r} \sqrt{N_j A_j}. \tag{21}$$

It is interesting to observe that the allocation of the sample is
independent of the particular percentile which is to be estimated.
Moreover, if the percentile is to be determined for the whole population, A_i
becomes equal to N_i, and the allocations are simply made
proportionate to the total number of individuals in each stratum.
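
The allocation rule, under which each stratum's share of the sample is proportional to the square root of the product of its size and its number of A type individuals, can be sketched in a few lines of Python. The function name and the stratum figures are hypothetical, and the result is left unrounded.

```python
import math


def allocate_sample(n_total, strata):
    """Allocate n_total schedules among strata in proportion to
    sqrt(N_i * A_i). `strata` is a list of (N_i, A_i) pairs."""
    weights = [math.sqrt(N * A) for N, A in strata]
    total = sum(weights)
    return [n_total * w / total for w in weights]


# Hypothetical strata: (units in stratum, A type units in stratum).
print(allocate_sample(1_000, [(10_000, 2_500), (40_000, 10_000)]))

# Whole-population case (A_i = N_i): allocation proportional to N_i.
print(allocate_sample(1_000, [(10_000, 10_000), (30_000, 30_000)]))
```

In practice the allocations would be rounded to whole schedules, with any remainder assigned by judgment.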

Illustration. It was mentioned in the introduction that the standard 
error of the median may be substantially less than that of the mean in 
certain distributions, especially those of an economic character. A con- 
crete illustration of this may be of interest. The accompanying tabula- 
tion presents a distribution of 1,200 families according to their annual 
incomes. This distribution is patterned after a sample obtained by the 
Study of Consumer Purchases in a restricted section of New York 
City. It thus represents what may be found in practice. 


Annual income        Families      Annual income        Families

Under $250               10        $2,500 to $2,999        157
$250 to $499             12        $3,000 to $3,499         88
$500 to $749             26        $3,500 to $3,999         52
$750 to $999             49        $4,000 to $4,499         33
$1,000 to $1,249         89        $4,500 to $4,999         20
$1,250 to $1,499        104        $5,000 to $7,499         54
$1,500 to $1,749        125        $7,500 to $9,999         15
$1,750 to $1,999        132        $10,000 and over         18
$2,000 to $2,249        126
$2,250 to $2,499         90        All incomes           1,200










Applying equation (19), the standard error of the median of this 
distribution is estimated to be about $34. The usual methods lead to an 
estimate of the standard error of the mean of $73. The range of uncer- 
tainty of an estimate of the mean based on this sample would then be 
about double the range for the median. 

The difference in reliability may be exhibited in more striking fashion 
in terms of the greater size of sample required to get an estimate of the 
mean having a standard error of only $34. It is readily found that to
provide such an estimate a sample of approximately 5,500 families
would be required, or a sample more than 4½ times as large as that
required to estimate the median with the same standard error.
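
The figures in this illustration can be retraced with a short Python sketch. The class frequencies are those of the tabulation above; the lower bound of zero for the lowest class, the variable names, and the assumption that the standard error of the mean varies inversely with the square root of the sample size are supplied here for the purpose of the example.

```python
import math

# (lower bound, class width, families); the open-ended top class has no width.
classes = [
    (0, 250, 10), (250, 250, 12), (500, 250, 26), (750, 250, 49),
    (1000, 250, 89), (1250, 250, 104), (1500, 250, 125), (1750, 250, 132),
    (2000, 250, 126), (2250, 250, 90), (2500, 500, 157), (3000, 500, 88),
    (3500, 500, 52), (4000, 500, 33), (4500, 500, 20), (5000, 2500, 54),
    (7500, 2500, 15), (10000, None, 18),
]
n = sum(freq for _, _, freq in classes)

# Locate the class interval containing the median.
cumulative = 0
for lower, width, freq in classes:
    if cumulative + freq >= n / 2:
        C, f_c = width, freq
        break
    cumulative += freq

se_median = C * math.sqrt(n) / (2 * f_c)        # equation (19)

# Sample needed for the mean to reach a $34 standard error, starting from
# the $73 quoted in the text for a sample of 1,200.
ratio = (73 / 34) ** 2
n_mean = n * ratio
print(round(se_median), round(n_mean), round(ratio, 2))
```

The median falls in the $2,000 to $2,249 class, and the standard error comes out at about $34, in agreement with the text.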

















PRICES AND WAGES* 


By Walter G. Keim
Bureau of Labor Statistics 


In approaching the subject of prices and wages, a distinction should
be made between two different relationships. First, there is the
relationship between the earnings of a worker and the cost of goods which
he purchases with his earnings. Then, there is the relationship between 
wages as a cost element in the production of goods and the price of the 
goods. The first of these relationships involves the income of the worker 
as measured, for example, by his average weekly earnings. The second 
involves unit labor costs, of which average hourly earnings are an im- 
portant, but not a precise, indicator. 

This paper is intended to describe the major trends in wages and 
prices and to indicate the various factors which have had a tendency to 
disturb the relationships between wages and prices since the beginning 
of the war in Europe. Lack of time, of course, precludes detailed dis- 
cussion and analysis. 

An analysis of the first relationship, that of changes in workers’ in- 
come as measured by average weekly earnings and changes in the cost 
of living, indicates that since the outbreak of the war the economic 
status of the typical wage earner in manufacturing and mining has been 
materially improved. The cost of living between August 1939 and Oc- 
tober 1941 rose 12 per cent while average weekly earnings advanced 34 
per cent. This amounts to an increase of 20 to 22 per cent in the real 
weekly earnings of factory employees. Since the very outbreak of the 
war this relationship has persisted. Between September 1939 and Sep- 
tember 1940, weekly earnings mounted 7.4 per cent while the cost of 
living in large cities actually declined 0.2 per cent. Similarly, between
September 1940 and October 1941, the respective changes were an in- 
crease of 23.7 per cent in earnings and an advance of 9.7 per cent in 
living costs. It is likely that total annual earnings would show an even
greater increase because of fuller employment. 
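
The real-earnings comparison can be approximated by deflating the change in money earnings by the change in the cost of living. The Python sketch below uses only the rounded percentages quoted in this paragraph; the function name is illustrative. With these rounded inputs the first period works out to a little under 20 per cent, near the lower end of the range stated in the text, which presumably rests on unrounded index values not reproduced here.

```python
def real_change(money_pct, cost_of_living_pct):
    """Percentage change in real earnings, given percentage changes in
    money earnings and in the cost of living over the same period."""
    return ((1 + money_pct / 100) / (1 + cost_of_living_pct / 100) - 1) * 100


periods = {
    "Aug 1939 to Oct 1941": (34.0, 12.0),
    "Sep 1939 to Sep 1940": (7.4, -0.2),
    "Sep 1940 to Oct 1941": (23.7, 9.7),
}
for label, (earnings, living) in periods.items():
    print(f"{label}: {real_change(earnings, living):.1f} per cent")
```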

* A paper presented at the 103rd Annual Meeting of the American Statistical Association, New 
York, December 28, 1941. Since this paper was prepared, there has been some shift in the focus of the 
discussion regarding the relationship between prices and wages. The emphasis in this paper is upon 
wages as an element in the cost of producing goods and their consequent impact upon the price struc- 
ture. Since the tremendous increase in the pace of war production which has occurred after Pearl Harbor, 
and the progressive reduction in the volume of goods available for the civilian market, there has been 
increasing concern with the relation of wages to total public purchasing power. During the summer of 
1942, when this paper was going to press, primary emphasis in the formulation of national policy was
upon this latter phase; upon the possibility that widespread wage increases, by increasing consumer
demand, might lead to excess pressure upon the price structure.

To this observation that the status of the average wage earner has
been materially improved, there must, however, be added a number of
reservations. First, in interpreting these statistics it should be borne in
mind that a significant proportion of the upward movement in average
weekly earnings for all manufacturing industries is to be accounted for
by a shift in employment. Under the impact of the defense program,
employment in durable goods industries, which pay relatively
higher wages, has risen more rapidly than in the nondurable goods field.

The figures just presented are national averages and workers have 
by no means been uniformly affected. In many lines of work the change 
in average weekly earnings has been much smaller than the average; 
for example, compared with the increase of 34 per cent in all manufac- 
turing industries, weekly earnings have changed only 8 per cent in 
newspaper plants, 11 per cent in hosiery manufacturing. Moreover, 
the data for the country as a whole are not necessarily typical of de- 
fense areas where both the advances in earnings and the increase in 
cost of living have been greater than average. 

Much of the increase in average weekly earnings represents longer 
hours of work resulting from the transition from part-time to full-time 
work and more recently the rapid lengthening of hours of work so that 
overtime at high rates is paid. 

Despite these reservations, however, there can be little doubt that 
the average industrial worker is today much better off than he was be- 
fore the war began. 

We can turn now to the second relationship—that between wages as 
an element in costs of production and the prices of goods produced. 
There has been much talk about the role that wage increases have 
played in the advance in prices since August 1939. The figures available 
seem to indicate rather clearly that until the summer or autumn of 
1941 at least, increases in wages have not been the primary factor in 
causing prices to go up. 

A first indication that industrial wages have not been the key factor 
in price increases may be taken from the following figures contrasting 
the per cent of increase among different groups of prices. Whereas the 
all-commodities index of wholesale prices advanced approximately 23 
per cent from August 1939 to November 1941, prices of basic raw ma- 
terials, as shown by the Bureau’s index of 28 basic commodities, in- 
creased 55 per cent over the same period; the index of all raw materials 
has gone up about 36 per cent; semi-manufactured goods have gone up 
about 20 per cent; finished manufactured goods about 19 per cent. The 
durable heavy goods, which are most affected by the defense program,
have gone up only about 15 per cent. The index for all commodities
other than farm products, which advanced 19 per cent, is the one most 
comparable with the data on industrial earnings. 

These figures show that the greatest price increases in this list since 
August 1939 have been among the raw materials. Wages do not consti- 
tute a key factor in the cost of any of these commodities. The greater 
the price advance, the smaller has been the element of fabrication by 
factory workers in this country. The converse is also true. The smallest 
price advance for any of the groups of commodities just cited is that for 
durable manufactured goods. It is in this field of durable goods that the 
largest wage advances have occurred. For durable goods as a whole, 
average hourly earnings rose about 28 per cent between August 1939 
and October 1941, while the corresponding increase in the case of non- 
durable goods amounted to less than 12 per cent. Such increases in 
wages as have been granted obviously have been a minor factor in the 
price advances that have actually occurred. Price increases at whole- 
sale have generally preceded increases in wage rates and have far out- 
run them. 

Wages, of course, constitute only one of the costs of manufacturing 
and hence the selling prices of manufactured goods do not have to rise 
at the same rate as labor costs, merely to cover the cost of any given 
wage increase. When all manufacturing industries are considered to- 
gether, account must be taken of the fact that the output of one pro- 
ducer frequently becomes the raw material of another. Hence, the 
question is: what is the ratio of wages paid to the value added by
manufacture? Census data show that wages constitute about 40 per cent of
value added by all manufacturing operations. On the average,
therefore, a 2½ per cent increase in selling prices will cover the cost of a 6 per
cent general wage increase provided that output per man hour is
unchanged.

Let us analyze the actual sequence of events. There have been two 
major upswings in prices since the beginning of the war in Europe. One 
occurred in the latter part of 1939 when the index for all nonagricul- 
tural commodities increased 6 per cent from the outbreak of the war at 
the end of August to a peak in December of that year. During this pe- 
riod, average hourly earnings changed hardly at all. There was a gen- 
eral reduction in prices in the early months of 1940, but after August 
1940, when the effects of the American defense program began to be 
felt, prices resumed their upward trend, and have been rising almost 
without interruption ever since. From August 1940 to March 1941, 
prices of all commodities rose on an average by 8 per cent and average
hourly earnings rose by 4 per cent. In each of these two periods of rising
prices and wages from the outbreak of war to March 1941, therefore, 
the advance in prices cannot be attributed to higher wages. 

It was not until March 1941 that important changes in the wage rates 
occurred. The major wage advances in early 1941 really began in 4 
major industries—cotton manufacturing, coal, steel and automobiles. 
In each of these industries it was evident that profits had increased. 
Since that time, there has been a general fanning out of these wage
increases to other industries.

In the case of cotton goods, wholesale prices had risen by 13 per cent
in the 5 months prior to the wage increase (October 1940 to March
1941) and actually by 4½ per cent during March alone. During this
period, moreover, prices of raw cotton were advancing only moderately
and unit overhead was decreasing because of rapidly advancing sales. 
Mill margins on cotton cloth during this period had risen by an average 
of 36 per cent to the highest levels on record. 

The wage increase in cotton manufacturing that began in March 
amounted to 10 per cent in the North and 7 to 8 per cent in the South 
among those cotton mills that made the adjustment. The average ef- 
fect from March to June was to raise hourly earnings in cotton mills as 
a whole by 6.6 per cent. In July 1941, hourly earnings advanced again 
as a result of adjustments to the new 37½ cent legal minimum wage of 
the Fair Labor Standards Act, bringing the total increase from August 
1939 to September 1941 to 26 per cent. Assuming for the moment that 
labor costs rose by the same percentage, the added cost could have been 
met by a 6 or 7 per cent increase in selling prices, since direct wages rep- 
resent only about one-quarter of the value of cotton-mill products. 
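
The arithmetic behind the "6 or 7 per cent" figure can be sketched as follows. This is only an illustration under the text's own simplifying assumptions: labor cost rises by the full 26 per cent, direct wages are one-quarter of product value, and every other cost and the margin per unit stay fixed. The function name `required_price_increase` is hypothetical.

```python
def required_price_increase(labor_cost_rise: float, labor_share: float) -> float:
    """Fractional price rise that just covers a rise in labor cost.

    Assumes all non-labor costs and the margin per unit are unchanged,
    so only the labor share of product value is affected.
    """
    return labor_cost_rise * labor_share


# Figures quoted in the text: a 26 per cent rise in labor cost, with direct
# wages at about one-quarter of the value of cotton-mill products.
cotton_price_rise = required_price_increase(0.26, 0.25)
print(f"Price rise needed to cover the wage advance: {cotton_price_rise:.1%}")
```

Any price advance much beyond this amount would have to be explained by something other than wages, which is the point of the comparison with the 61 per cent rise reported in the next paragraph.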

Not only had prices already risen by 13 per cent between October 
1940 and March 1941, but they continued to advance after March. By 
November 1941, the average price of all cotton goods was 61 per cent 
higher than in August 1939 and 47 per cent above the level of October 
1940. Even though raw cotton prices advanced sharply since February 
1941, the mill margins in October 1941 were 79 per cent higher than in 
August 1939. Of this advance, as we have seen, only a very small proportion 
could be attributed to a rise in wage rates. 

The wage increase in steel mills and in some of the leading automobile 
factories amounted to 10 cents an hour. This would be equivalent to an 
11½ per cent rise in steel wages and about a 10 per cent rise in the automobile 
industry if universally applied. By July, hourly earnings in steel 
mills had leveled off at 11 per cent above the average for March. A 
number of the new wage agreements in the automobile industry provided 
for increases of 5 cents or 8 cents in place of the 10 cent advance 
which was granted by General Motors on May 15, and by July the 
average increase for the industry over the level of March amounted to 
8 cents an hour. Since the wage increase in steel mills, the quoted price 
of steel has shown no significant change. Automobile prices were raised 
in October, reflecting in part changes in unit cost resulting from a combination 
of factors: shifts to defense production and changes in raw material 
costs and labor. Looking at the overall picture during the period 
March 1941 to October, the price index for nonagricultural commodities 
went up 14 per cent and average hourly earnings increased 10 per cent. 

In interpreting all these figures, it cannot be overemphasized that 
higher average hourly earnings do not necessarily reflect wage rate in- 
creases. Thus, if a man is being paid on a piece basis, he may increase 
his hourly earnings simply by producing more per hour. Similarly, 
higher hourly earnings may reflect increased overtime at premium 
rates rather than a change in basic wage rates. More broadly, higher 
hourly earnings do not necessarily mean rising unit labor costs. The 
cost of labor per unit of output (that is, per yard of goods or per ton of 
coal or per ton of copper) depends on how much is produced per hour 
as well as on what the worker gets per hour. While hourly earnings have 
been advancing in recent years, so has output per man-hour. Indeed, 
the data at the disposal of the Bureau of Labor Statistics indicate that 
the output of manufacturing industry per man-hour increased about 16 
per cent on the average from 1937 through the summer of 1941. In the 
same period average hourly earnings increased by 17 per cent; hence, 
unit labor costs advanced only 1 per cent despite the wage increases 
since March. In other words, the increase in hourly earnings was offset 
almost entirely by the increase in the amount of goods produced per 
hour. 
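
The offsetting effect described above can be checked with a short calculation. This is a minimal sketch, assuming that unit labor cost is simply hourly earnings divided by output per man-hour; the function name `unit_labor_cost_change` is hypothetical.

```python
def unit_labor_cost_change(earnings_rise: float, output_per_hour_rise: float) -> float:
    """Fractional change in labor cost per unit of output.

    Unit labor cost = hourly earnings / output per man-hour, so the two
    index changes combine as a ratio rather than as a simple difference.
    """
    return (1 + earnings_rise) / (1 + output_per_hour_rise) - 1


# Figures quoted in the text for 1937 through the summer of 1941:
# hourly earnings up 17 per cent, output per man-hour up about 16 per cent.
change = unit_labor_cost_change(0.17, 0.16)
print(f"Change in unit labor cost: {change:.1%}")
```

The ratio form gives a little under 1 per cent, in agreement with the "only 1 per cent" stated in the text.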

So much for the immediate past. We have seen that higher wages 
have not been primarily responsible for the rising general level of prices. 
But at this time many factors are tending to disturb wage-price rela- 
tionships. Up to the present, in many industries, increases in actual unit 
labor costs have been more than offset by lower overhead per unit. 
However, from this point on, there will be definite limits to our ability 
to increase productivity. 

Similarly, we are reaching the limit of our capacity to expand pro- 
duction. There will be numerous shifts from consumer goods production 
to durable defense goods production. The period of increasingly lower 
burden per unit of production is reaching an end. In fact, some industries 
will be forced to curtail production and, depending on the nature 
of their output, they may not be able to transform their plants satisfactorily 
in order to produce defense goods. Hence, many will be faced 
with the problem of vast curtailment of normal production with the at- 
tendant increases in unit overhead costs. 

These shutdowns may be due to a number of causes, of which the 
most important will probably be shortages of materials. The automo- 
bile industry, the electric refrigerator industry, the washing machine 
industry, in fact nearly all industries producing consumers’ durable 
goods, are going to be seriously affected in this way. Also, many manufacturers 
of perishable goods will find that they will have difficulties in 
replacing manufacturing equipment, because of the shortages of metals 
used in making the equipment. 

Certain producers may well be faced with such problems as power 
shortages. Thus, during this last year, production costs in the South- 
ern textile industry increased because of the inability to secure suf- 
ficient power. Then there will be the question of transportation 
difficulties, shortages in certain types of skilled labor, and similar 
problems. 

Thus in certain industries operating at capacity or under limitations 
imposed by the war program further wage increases will inevitably 
bring a demand for higher prices. 

This means in the long run the raising of living costs. Even at pres- 
ent, the large increases in prices in wholesale markets, particularly 
for foods (not due to wage increases) are rapidly filtering through to 
retail markets and are raising the cost of living. There will be many 
new demands for increases in wage rates. In the last war unions occa- 
sionally included in their contracts an escalator clause providing for 
increases in wages with advances in living costs or prices of the prod- 
uct they produced. At the present time, however, unions are generally 
not requesting this provision. Although there are exceptions, more often 
they merely use the record of the upward changes in cost of living to 
justify their demands for increased wages. 

Unless careful consideration is given to the ability of concerns to 
pay higher wages without further increases in their selling prices, we 
may well face a cycle of wage and price increases that will become a 
serious threat both to the general welfare of the Nation and to the 
conduct of the defense program itself. 


























THE USE OF TESTS OF SIGNIFICANCE IN 
AN AGRICULTURAL EXPERIMENT STATION* 


By George W. Snedecor 
Iowa State College 


THE PURPOSE of this paper is to recount some of the uses made of 
tests of significance in our research procedure. In order to clarify 
my presentation it is necessary at the very outset to specify rather 
carefully the circumstances in which these tests are applied. 

Our samplings are of two types, the rather intensive sort involved 
in controlled experiments and the more extensive surveys conducted 
by questionnaire, visitation, or the examination of extant records. 
The objectives are the same; namely, unbiased estimate of population 
facts and probability statements based on these estimates. 

The populations sampled are ordinarily described in character- 
istic fashions. Those subjected to experimentation are specified under 
some such caption as “Materials and Methods.” They are often volun- 
tarily restricted by the choice of experimental animals and plants as 
well as by the attempted control of extraneous effects. For example, 
the rat may be selected for testing the difference in weight gain conse- 
quent upon two treatments. The animals used may be taken from some 
highly inbred colony maintained in a carefully controlled environ- 
ment, and the sampling confined to individuals of the same age and 
sex. Thus, the heredity, the environment and the two treatments of 
the chosen animal all enter into the specification of the population. 
The population sampled by inquiry must be described no less metic- 
ulously. The region sampled, the questions asked, the sampling unit 
together with the mode of its selection, the stratification adopted, and 
the date are all pertinent. 

Coming now to the immediate topic of my paper, I remark that 
an experiment is a sampling designed wholly or in part to test some 
null hypothesis. To fix attention, let us think of a field plot trial to 
discover if the application of a specified amount of fertilizer affects the 
yield of a certain crop. The objective is attained if we learn whether 
or not the use of the fertilizer differentiates the sampled aggregate into 
two populations; the one, yields from treated plots, and the other, 
yields from untreated. You see at once that either of two null hypo- 
theses may be suitable. If the fertilizer applied is supposed to cause a 


* A paper presented at the 103rd Annual Meeting of the American Statistical Association in joint 
session with the Institute of Mathematical Statistics, New York, December 27, 1941. 

more or less durable improvement in the soil, its cost may be charged 
to future gains, and the appropriate null hypothesis is the identity of 
the yields on fertilized and unfertilized plots. If, on the contrary, the 
value of the treatment lies only in its effect on the current crop, the 
hypothesis may be that the difference in yields is exactly sufficient to 
pay the cost of the application. 

With the experimental data in hand the investigator calculates the 
mean difference in yield, together with some appropriate quantity such 
as t, z, or F. The probability of this quantity is the so-called test of 
significance. It is really evidence concerning the chosen null hypothe- 
sis, and must be considered as part of all the other evidence accumu- 
lated prior to and during the experiment. In the light of his entire 
experience, including the test of the pertinent hypothesis, the agrono- 
mist then makes a decision about the efficacy of the fertilizer under the 
experimental conditions obtaining. 
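
As an illustration of the calculation just described, the sketch below computes t for the mean difference between paired treated and untreated plots. It is not the Station's procedure: the pairing of plots, the function name `paired_t`, and the sample figures are all assumptions made for the example. The `null_difference` argument covers both hypotheses mentioned above, zero for identity of yields, or the yield that would just pay for the fertilizer.

```python
import math
import statistics


def paired_t(differences, null_difference=0.0):
    """t statistic for the mean of paired differences against a null value.

    differences: treated-minus-untreated yields, one value per pair of plots.
    The result is referred to the t distribution with len(differences) - 1
    degrees of freedom.
    """
    n = len(differences)
    mean_difference = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return (mean_difference - null_difference) / standard_error


# Hypothetical yield differences (bushels per acre) on six pairs of plots.
yield_gain = [3.1, 4.4, 2.7, 5.0, 3.8, 4.2]

t_identity = paired_t(yield_gain)                        # null: no difference
t_cost = paired_t(yield_gain, null_difference=3.0)       # null: gain just pays cost
print(round(t_identity, 2), round(t_cost, 2))
```

The same data give a much smaller t under the second hypothesis, which shows why the choice of null hypothesis has to be settled before the probability is read.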

Of great utility in our practice is an extension of the experiment just 
described to include comparisons among large numbers of varieties or 
treatments. The data from such an investigation lead to a set of means 
whose variance is comparable with a second variance provided by the 
sampling, commonly designated as experimental error. To test the 
null hypothesis, the ratio of these two variances is calculated and com- 
pared with the tabulated distribution. A large value of the variance 
ratio, F, is looked upon by the experimenter as evidence that his mean 
yields are more widely dispersed than would be expected in random 
sampling from a single population. If this evidence is compatible with 
that already available he concludes that the higher yielding varieties, 
other characteristics being satisfactory, are worth selecting for further 
trial, or for recommendation to farmers operating in an environment 
similar to that of the experiment. On the other hand, a small value of 
F warns him that the population of yields represented by these varie- 
ties may be undifferentiated. If this is in conformity with other experi- 
ence he may ignore yield as a criterion, freely selecting such varieties 
as may possess some other desirable characteristics. 
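
The variance ratio can be sketched for the simplest layout, in which each variety is grown on several plots and no blocking is used. That layout, the function name `variance_ratio`, and the yields shown are assumptions for illustration; a field trial arranged in blocks would also remove a block sum of squares from the error.

```python
def variance_ratio(groups):
    """F = (mean square among variety means) / (mean square within varieties).

    groups: one list of plot yields per variety.
    Degrees of freedom are k - 1 and N - k for k varieties and N plots.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)   # the "experimental error"
    return ms_between / ms_within


# Hypothetical yields for three varieties, four plots each.
yields = [
    [41.0, 39.5, 42.3, 40.1],
    [44.2, 45.0, 43.1, 46.4],
    [40.2, 41.1, 39.0, 40.8],
]
print(round(variance_ratio(yields), 2))
```

The value printed would be compared with the tabulated F distribution for 2 and 9 degrees of freedom.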

As a third example of our use of tests of significance, I have chosen 
another often encountered feature of yield tests; namely, the effect of 
stand; that is, the number of plants per unit of area. This is a variate 
concomitant to yield, difficult to control but easy to measure. It may 
be a variety characteristic, or it may be due to random variation in 
germination and early growth, the distinction being important agro- 
nomically. If the variety variation in stand is found to be significant, an 
otherwise promising kind may be rejected because of poor seed germination. 
On the contrary, a small F may lead the plant breeder to ignore 
stand as a basis for selection, and to evaluate yield by use of expected 
mean values adjusted by the regression of yield on stand. In 
this latter event, he may increase the precision of his evaluation by 
eliminating from experimental error the measured variance attributable 
to the random irregularities of stand. 

Quite a different use of tests of hypotheses is that commonly invoked 
if experimental error rests on the variation of individuals within two or 
more groups receiving different treatments. In order to test the hy- 
pothesis that the means are drawn from a common source, the group 
variances must be pooled. If these batches of variance are samples 
from populations with different variances the F-test doesn’t give un- 
ambiguous information about the means. Hence, the hypothesis of 
homogeneity of variance is set up and tested. An unusually large value 
of chi-square may lead the investigator to use some appropriate trans- 
formation of his data in order that the F-test may give him desired evi- 
dence about the variation of the group means. 
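
The paper does not name the chi-square test used for homogeneity of variance. One common choice is Bartlett's test, and the sketch below assumes it; the function name `bartlett_chi_square` and the sample data are hypothetical. The statistic is referred to chi-square with k - 1 degrees of freedom for k groups.

```python
import math
import statistics


def bartlett_chi_square(groups):
    """Bartlett's statistic for the hypothesis of equal group variances.

    groups: one list of observations per treatment group.
    Every group needs at least two observations and a non-zero variance.
    """
    k = len(groups)
    dfs = [len(g) - 1 for g in groups]
    variances = [statistics.variance(g) for g in groups]
    total_df = sum(dfs)

    # Pooled variance: group variances weighted by their degrees of freedom.
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df

    numerator = total_df * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction


# Hypothetical data: the second group is far more variable than the first.
steady = [10.1, 9.8, 10.3, 10.0, 9.9]
erratic = [6.0, 14.5, 9.0, 12.8, 7.7]
print(round(bartlett_chi_square([steady, erratic]), 2))
```

A large value here would suggest transforming the data, for example to logarithms, before pooling the group variances for the F-test.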

The foregoing tests have been selected partly because of their com- 
mon occurrence and partly because they illustrate some of the types of 
decisions based on experimental evidence. In the extensive type of 
sampling, tests of significance may not be used at all. If an estimate 
of population total is the statistic desired, the appropriate statement 
about probability is the fiducial limits. The effectiveness of the sam- 
pling design is ordinarily examined by studying the efficiency of strati- 
fication and of the size and allocation of the sampling unit. Neverthe- 
less, group means of such extensive samples are often compared, and 
regression together with its linearity must frequently be tested. Hence, 
it is not far from the truth to say that a majority of the steps in the 
examination of our data are guided by tests of some null hypothesis. 

I should like to make it clear that these tests may not be made for- 
mally. Careful perusal of the data occasionally obviates the necessity 
of calculations. Again, since careless design or faulty execution of an 
experiment may yield no estimate of error, the statistician may be 
forced to resort to rough approximations based on knowledge of range 
or coefficients of variation. Naturally, he fortifies his decisions about 
significance with greater than the usual factor of safety. 

You may ask if conclusions and recommendations flowing from an 
experiment are ever based wholly on the probability of the testing 
statistic. I suspect not, since no experiment can yield proof but only 
evidence, and since few investigators design a sampling without an 
extensive background of information and experience. Moreover, there 
is nearly always evidence gained during the progress of an experiment 
which is not incorporated into the numerical data. Conceivably the in- 
vestigator may come to a final trial with pros and cons exactly bal- 
anced, leaving his decision to the outcome of the experiment. If so, what 
odds would he demand against the null hypothesis to reject it? Greater 
than 99:1, I suppose. More often an experiment is conducted to pro- 
vide a link in some long chain of evidence. I see no necessary objection 
in such case to rejecting the hypothesis on the basis of any odds 
against it greater than 1:1. Except for beginners and those who are 
easily swayed by their emotions, I think there is little merit in setting 
arbitrary values to probability, then decreeing that smaller must lead 
to rejection. The 5 per cent and 1 per cent points are convenient mile- 
stones which the investigator will note in passing, but any probability 
turned up constitutes evidence pertinent to decisions. 

All the circumstances of applied sampling warn us against blind 
subservience to any conventional probability. The distribution of a 
testing function is worked out from assumptions of normality of popu- 
lation, randomness of sampling, independence, etc. This model is 
probably not exactly reflected in any actual sampling. It is known that 
rather generous relaxations of some of the conditions have little effect 
on the probabilities involved. On the other hand, work of a number of 
investigators makes it clear that inaccuracies of several per cent either 
way from the 5 per cent point may not be uncommon. This does not 
disturb the sampler who looks upon the test of his hypothesis as con- 
tributory evidence, though it may be devastating to the worshipper 
of 5 per cent. It is the consulting statistician who finds arbitrary points 
handy because his decisions must be based rather solidly upon the evi- 
dence of the data themselves. 

I am coming to believe that the term, “test of significance” is creat- 
ing more confusion than it resolves. The phrase is not descriptive of 
the logical and experimental concepts involved. Partly for this reason, 
the dictionary definitions of the words blind many to their meaning in 
terms of probability. Perhaps the time has come to state the probabil- 
ity of the testing function as the end-point of the statistical investiga- 
tion leaving the researcher to combine this evidence with that already 
accumulated, then to rest his decision on the whole of it. 

















MECHANIZATION OF STATISTICAL DRAFTING 


By R. von Huhn 


DURING World War I statistical drafting came into its own. The 
various war agencies were in constant search for men who were 
familiar with the technique and principles of graphic presentation. 
Needless to say, most of the men who were qualified came from the en- 
gineering profession. Brinton, the engineer, had published his book on 
graphic presentation in 1914, the first of its kind. 

In the years following, the late Knoeppel¹ became one of the foremost 
exponents of graphic production control. He devised large control 
boards which, by mechanical means, served to guide the production 
engineer as to the status of balanced inventory, schedules, actual output 
and deliveries. Use was made of the principle of the “Gantt Chart,” 
translated into a graphic mechanical set-up by means of movable 
strips, replaceable pins, etc. The devices were kept in the office of the 
supervising production engineer for purposes of immediate control. 
The same method is used by various industries in the present emergency 
to control the flow of production and assembly of implements of war. 

In 1917 there were at the disposal of the graphic analyst only few 
materials which enabled him to perform his work efficiently. Since 
then the situation has greatly changed. New reproduction methods 
have been discovered and others have been perfected. New materials 
such as Zipp-a-tone, acetate overlays, transparent tape, rubber cement, 
scotch tape, floating letter type, photo-letter type, stack type letters, 
the Wrico and the Leroy mechanical lettering sets are now on the 
market. 

Although only indirectly related to the subject matter discussed here, 
mechanization of computing operations should also be mentioned. The 
hand operated computing machine, while still in use, has given way to 
the electrically operated machine. The various models on the market 
have been constantly perfected, keeping pace with the requirements of 
the statistician or accountant. The automatic tabulating machines 
which are essential for the mass production of statistics need hardly be 
mentioned here. Incredible as it seems, each year new improvements 
are made and the older designs are revised. 

At present, as far as the writer knows, there is only one mechanical 
device on the market which eliminates drafting entirely—the so-called 


¹ C. E. Knoeppel, Graphic Production Control, pp. 477, Ill. New York, McGraw-Hill Book Company. 

Cosmograph, manufactured by the International Business Machines 
Corporation. The device is used to show the flow of a percentage distribution.² 
The other device designed by the writer for the use of the 
United States Government is the mechanical intensity shading map 
which was described in the December 1938 issue of this JOURNAL. The 
writer has recently constructed a crude demonstration model of a 
mechanical bar chart which takes care of percentage bar diagrams 
showing four different component parts, and their respective shadings 
adjustable to any position. 

The advance in technique has, on the whole, been one-sided with 
emphasis on the perfection of materials for reproduction methods. This 
may be termed partial mechanization of statistical drafting. Complete 
mechanization of drafting may be defined as a process by which a final 
chart or map is produced entirely without the aid of drafting instru- 
ments in the same manner as the typist prepares a copy. It is interesting 
to note that H. Gray Funkhouser,³ who has made what is, up to the 
present, the most thorough contribution to the history of graphic pre- 
sentation, did not discover in his researches evidences of mechanization 
of statistical drafting, nor did he mention its possibilities. 

Complete mechanization of statistical charts and maps will never 
answer all the demands set forth by the investigator who makes use of 
graphic presentation; nor can it take the place of individual originality 
which plays a great part in the design of graphs analyzing special situa- 
tions. The United States Civil Service Commission recognizes now that 
graphic presentation and analysis are well-defined professional fields 
requiring special qualifications and training. 

There is, however, a distinct place for a machine which will produce 
simple line charts especially time series which show long term trends. 
The job of plotting a 20-year trend by months for three or more curves 
is a tedious one and belongs, in the writer’s opinion, to the field of 
mechanization. 

What then are the requirements for a statistical charting machine? 
The following are some of the points which seem essential: 

1. Standard 10-keyboard from 0 to 9, similar to the one found on 
the Sundstrand Adding Machine. 

² The above statement does not refer to mechanical devices used in industry which automatically 
keep graphic records of temperature, gas consumption, variation of speed, barometric pressure, etc. The 
roles which these instruments play in our industry have been described in a monograph entitled: “Industrial 
Instruments and Changing Technology,” by G. Perazich, H. Schimmel and B. Rosenberg, W.P.A. 
National Research Project, Report No. M-1, October 1938, Philadelphia, Pa. 

³ H. Gray Funkhouser, “Historical Development of the Graphical Representation of Statistical 
Data,” Osiris, Volume III, Part I, 1937, Bruges (Belgium). Funkhouser’s fundamental work should be 
read by all students interested in graphic methods of presenting statistical data; besides an interesting 
text, it contains an excellent bibliography. 

2. Adjustable vertical and horizontal movement, the latter designed 
to permit variation in distance between ordinates, as the time scale 
may require. 

3. The most important part of the machine is the printing mecha- 
nism, which controlled by the keyboard would coordinate the desired 
horizontal and vertical movement, thereby executing automatically 
what is otherwise left to the draftsman. This mechanism or integrator 
is visualized to run on tracks by tooth and pinion and to respond ver- 
tically to the operation of the keyboard in such manner that the 
machine-plotted points are either arithmetically or logarithmically pro- 
portional in height in relation to a common reference line. The integra- 
tor should have a flexibility which permits scale variations according 
to the range of data which are to be plotted. The integrator then is to 
have two principal functions, namely, first to locate a point to be 
plotted in relation to the required scale and, secondly, to connect the 
point by a line with the next point plotted, in order to produce the 
required trend line. The printing mechanism should also be designed 
in a manner that when two or more trends are plotted, each will show 
a distinctly different legend or pattern, i.e., solid line, dash line, etc. 
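
The first function of the integrator, locating each point on an arithmetic or a logarithmic scale, can be stated as a short calculation. The sketch below is a present-day illustration of that rule only, not a description of any mechanism; the names `plot_height` and `trend_points`, and the chart dimensions, are hypothetical.

```python
import math


def plot_height(value, low, high, chart_height, logarithmic=False):
    """Height of a plotted point above the reference line.

    low and high are the values at the bottom and top of the scale;
    chart_height is the distance between them on the sheet.
    """
    if logarithmic:
        if value <= 0 or low <= 0:
            raise ValueError("a logarithmic scale needs positive values")
        fraction = (math.log10(value) - math.log10(low)) / (
            math.log10(high) - math.log10(low)
        )
    else:
        fraction = (value - low) / (high - low)
    return fraction * chart_height


def trend_points(values, ordinate_spacing, low, high, chart_height, logarithmic=False):
    """(x, y) positions for one series; joining them in order gives the trend line."""
    return [
        (i * ordinate_spacing, plot_height(v, low, high, chart_height, logarithmic))
        for i, v in enumerate(values)
    ]


# Hypothetical monthly index numbers plotted on a sheet 10 units high,
# with ordinates spaced 0.25 units apart.
series = [100, 104, 110, 125, 150]
print(trend_points(series, 0.25, low=50, high=200, chart_height=10.0))
```

Changing `low`, `high`, and `ordinate_spacing` corresponds to the scale flexibility asked for in requirements 2 and 3.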

If the mechanical requirements of the integrator, as suggested above, 
involve too many mechanical obstacles, a compromise would be to de- 
sign the integrator to take care of plotting only by locating the points 
correctly, which then could be connected by the draftsman. While this 
would not be complete mechanization, it would be immensely helpful 
in speeding up the completion of a long-term trend chart. The plotting 
points printed by the integrator would have to differ slightly, as for 
example, small solid circles, small open circle points, etc., thereby indicating 
to the draftsman that they belong to the same series.⁴ 

Owing to the manpower requirements of the war, the scarcity of competent 
men trained in the field of statistical drafting becomes greater every 
day. It seems that such a device as the writer visualizes should have 
been on the market long ago and that its construction would have oc- 
curred to the great business machine companies. One reason for its 
failure to appear may be the fact that the market for a machine of this 
nature would be fairly limited. 

The question now arises: What would be the advantages if a charting 


⁴ The solution of the mechanical integrator will tax the imagination of the designing engineer. The 
vertical movement may be activated through the use of calibrated cams, controlled by the keyboard 
operation or by cylindrical “wedges” or inclined planes attached to drums, similar to those which control 
and activate the motions of automatic machinery, or the solution may be found in an electrically controlled 
mechanism such as we find in tabulating machines. Still another direction of solution may be 
found in the application of the photoelectric cell which plays an increasingly important part in the new 
design and perfection of instruments for the automatic precision control of industrial processes. 

machine, such as proposed in the preceding paragraphs, were available? 
Doubtless some of the following can be numbered: 

1. It would be possible to turn over routine line charts to an operator 
untrained in drafting. 

2. The tedious plotting of points would be eliminated. 

3. The time for producing line charts would be decreased to a frac- 
tion of that previously required and consequently a greater volume 
could be turned out per man-hour. 

4. The graphs would have a uniform appearance and could be sub- 
jected to photostat or multilith reproduction. 

5. The cost of charting would be decreased and the machine would 
pay for itself in a relatively short time. 

6. There is a strong possibility that engineers would want to design the 
machine so that it could be connected electrically with the tabulating 
machine unit, with the result that the keys of the mechanism would be 
operated automatically and simultaneously through the punched ma- 
chine cards. Thus the process of counting and graphic presentation of 
time series could be accomplished during the same “run” of cards. 

The above are some of the advantages which would be derived if such 
a machine were available. In conclusion, it should be stated that 
mechanization of statistical drafting is still in its infancy but sooner or 
later progress will be made in that direction. 


COMMITTEE ON NOMINATIONS 


President Lotka has appointed the Committee on Nominations for 
1942. The Committee consists of Frederick F. Stephan, Cornell University, 
Ithaca, New York, Chairman; Joseph S. Davis, Food Research 
Institute, Stanford University, California; and George W. Snedecor, 
Iowa State College, Ames, Iowa. The report of the Committee on 
Nominations will be published in the November BULLETIN. 

R. L. Funkhouser, Secretary 























SKEWNESS OF COMBINED DISTRIBUTIONS 


By Theodore E. Raiford 
University of Michigan 


IT OFTEN HAPPENS that investigators, although eager to publish the 
results of their investigations, jealously guard the basic data from 
which the results were obtained. Since accumulative experience tends 
towards conclusiveness, combined results from several investigators 
on the same problem would often be of very great value. While making 
a study of the combined results from several investigations two points 
have arisen which seem worthy of special presentation, namely, (1) 
a new formula for determining the measure of skewness for the com- 
bined (parent) sets of data in terms of parameters which characterize 
the individual sets (subsets) of measurements, and (2) that combining 
subsets all of which are skew in the same direction does not necessarily 
result in a parent distribution with skewness in that same direction. 
This short paper presents a development of the formula suggested in 
(1) and a numerical example which not only illustrates its use but also 
establishes the truth of statement (2). 

Among the things usually given in the statistical part of a report on 
one’s findings from a single set of data, in addition to the number of 
measurements, are the mean, the standard deviation, and a measure of 
skewness. We shall denote these parameters for the parent distribution 
by the usual letters, without subscripts, namely, $N$, $M$, $\sigma$, and $\alpha_3$, and 
the corresponding parameters from the $i$-th subset we will designate as 
$n_i$, $M_i$, $\sigma_i$, and $\alpha_{3:i}$. Moreover, as the measure of skewness we shall use 
the third moment about the mean, expressed in standard units. 

A formula for the mean of the combined distribution from k subsets 
has been presented in many places in the form 


$$M = \frac{\sum_{i=1}^{k} n_i M_i}{N}.$$


In this notation, $N = \sum n_i$. One of the most convenient forms for the 
computation of $\sigma$ is one making direct use of the parameters contributed 
from the subsets and based on the formula,¹ 

$$\sigma^2 = \frac{\sum n_i(\sigma_i^2 + d_i^2)}{N}, \quad \text{where } d_i = M_i - M.$$


¹ H. C. Carver, Notes on Elements of Mathematical Statistics, Edwards Brothers, Inc., 1939. 

Forms for the computation of skewness of combined distributions 
have not been so numerous,² but the notation of the above formula for 
$\sigma$ suggests also a concise form in which $\alpha_3$ may be expressed. 

In a distribution in which X denotes the variable, using the notation 
of serial distributions for simplicity, we have the following relations, 


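$$M = \frac{\sum X}{n}; \qquad \sigma^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2; \quad \text{and}$$

$$\alpha_3 = \frac{n^2 \sum X^3 - 3n \sum X^2 \cdot \sum X + 2\left(\sum X\right)^3}{n^3 \sigma^3} \tag{1}$$
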

where the summations are from 1 to $n$. From these relations we get 

$$\sum X = n M, \qquad \sum X^2 = n(\sigma^2 + M^2), \quad \text{and}$$

$$\sum X^3 = n\sigma^3\alpha_3 + 3n\sigma^2 M + nM^3. \tag{2}$$


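If now we denote the measurements in subset No. 1 by $X_1$, those in 
subset No. 2 by $X_2$, and so on, we have at once from definition, 

$$
\begin{aligned}
N\sigma^3\alpha_3 &= \sum (X_1 - M)^3 + \sum (X_2 - M)^3 + \cdots + \sum (X_k - M)^3 \\
&= \sum X_1^3 - 3 \sum X_1^2 \cdot M + 3 \sum X_1 \cdot M^2 - n_1 M^3 \\
&\quad + \sum X_2^3 - 3 \sum X_2^2 \cdot M + 3 \sum X_2 \cdot M^2 - n_2 M^3 \\
&\quad + \cdots \\
&\quad + \sum X_k^3 - 3 \sum X_k^2 \cdot M + 3 \sum X_k \cdot M^2 - n_k M^3
\end{aligned} \tag{3}
$$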


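where the summations in $X_i$ are from 1 to $n_i$. If the summations in this 
equation are replaced by equivalent expressions defined by the relations 
given in (2) and if like terms are collected, the above equation 
becomes 

$$
\begin{aligned}
N\sigma^3\alpha_3 &= n_1\left[\sigma_1^3 \alpha_{3:1} + 3 d_1 \sigma_1^2 + d_1^3\right] \\
&\quad + n_2\left[\sigma_2^3 \alpha_{3:2} + 3 d_2 \sigma_2^2 + d_2^3\right] + \cdots \\
&\quad + n_k\left[\sigma_k^3 \alpha_{3:k} + 3 d_k \sigma_k^2 + d_k^3\right].
\end{aligned} \tag{4}
$$
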
From this equation a formula for $\alpha_3$ may therefore be written as³ 

$$\alpha_3 = \frac{\sum n_i \sigma_i^3 \alpha_{3:i} + \sum n_i d_i\left(3\sigma_i^2 + d_i^2\right)}{N\sigma^3} \tag{5}$$


² W. D. Baten, “A Formula for Finding the Skewness of the Combination of Two or More Examples,” 
this JOURNAL, March 1935, pp. 95-98. 

³ T. E. Raiford, “The Measurement of Variations in Asymmetrical Data with Applications to 
Hemoglobin Statistics of Infants,” Human Biology, February 1938, p. 139. 




A tabular form for the application of this formula is shown in 
Table I, which combines the results from 13 subsets of data drawn from 
studies, over a period of 13 months, of the amount of hemoglobin per 
100 cc. of blood in a certain group of children. Other columns may be 
added for convenience, depending mainly upon what mechanical device 
is used to perform the computations. 

TABLE I 

Subset | $n_i$ | $M_i$ | $\sigma_i$ | $\alpha_{3:i}$ | $d_i$ | $\sigma_i^2+d_i^2$ | $n_i\sigma_i^3\alpha_{3:i}$ | $n_i d_i$ | $3\sigma_i^2+d_i^2$
-------|-------|-------|------------|----------------|-------|--------------------|-----------------------------|-----------|--------------------
1      | 168   | 13.33 | 1.92       | -.31           | 1.98  | 7.6068             | -368.616                    | 332.64    | 14.9796
2      | 272   | 11.17 | 1.31       | -.35           | -.18  | 1.7485             | -214.018                    | -48.96    | 5.1807
3      | 329   | 10.92 | 1.04       | +.19           | -.43  | 1.2665             | 70.315                      | -141.47   | 3.4297
4      | 332   | 11.34 | 1.00       | -.04           | -.01  | 1.0001             | -13.280                     | -3.32     | 3.0001
5      | 308   | 11.60 | 1.08       | -.36           | .25   | 1.2289             | -139.677                    | 77.00     | 3.5617
6      | 312   | 11.67 | 1.05       | -.54           | .32   | 1.2049             | -195.036                    | 99.84     | 3.4099
7      | 291   | 11.45 | 1.00       | -.52           | .10   | 1.0100             | -151.320                    | 29.10     | 3.0100
8      | 273   | 11.30 | 1.03       | -.40           | -.05  | 1.0634             | -119.326                    | -13.65    | 3.1852
9      | 246   | 11.20 | 1.19       | -.37           | -.15  | 1.4386             | -153.383                    | -36.90    | 4.2708
10     | 263   | 11.16 | 1.07       | -.44           | -.19  | 1.1810             | -141.762                    | -49.97    | 3.4708
11     | 253   | 11.03 | 1.17       | -.25           | -.32  | 1.4713             | -101.302                    | -80.96    | 4.2091
12     | 210   | 10.98 | 1.11       | -.55           | -.37  | 1.3690             | -157.961                    | -77.70    | 3.8332
13     | 198   | 10.98 | 1.14       | -.59           | -.37  | 1.4365             | -173.073                    | -73.26    | 4.0357

$\sum n_i(\sigma_i^2 + d_i^2) = 5457.1556$; $\sum n_i\sigma_i^3\alpha_{3:i} = -1858.439$; $\sum n_i d_i(3\sigma_i^2 + d_i^2) = 3627.5243$ 
$N = 3455$; $M = 11.35$; $\sigma = 1.26$; $\alpha_3 = +.26$ 
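
The tabular computation can be retraced mechanically. What follows is a hedged sketch, not the author's procedure: the names `TABLE_I` and `combined_skewness` are hypothetical, and the `mean_decimals` argument is an assumption introduced to mirror the table, whose $d_i$ column is evidently formed from the mean rounded to two decimals. The standard deviation of subset 12 is taken as 1.11, the value implied by the other columns of that row.

```python
import math

# (n_i, M_i, sigma_i, alpha_3:i) for the 13 subsets, transcribed from Table I.
TABLE_I = [
    (168, 13.33, 1.92, -0.31),
    (272, 11.17, 1.31, -0.35),
    (329, 10.92, 1.04, 0.19),
    (332, 11.34, 1.00, -0.04),
    (308, 11.60, 1.08, -0.36),
    (312, 11.67, 1.05, -0.54),
    (291, 11.45, 1.00, -0.52),
    (273, 11.30, 1.03, -0.40),
    (246, 11.20, 1.19, -0.37),
    (263, 11.16, 1.07, -0.44),
    (253, 11.03, 1.17, -0.25),
    (210, 10.98, 1.11, -0.55),  # sigma inferred from the rest of the row
    (198, 10.98, 1.14, -0.59),
]


def combined_skewness(subsets, mean_decimals=None):
    """Apply equation (5) to (n_i, M_i, sigma_i, alpha_3:i) tuples.

    If mean_decimals is given, the combined mean is rounded before the
    differences d_i are formed (an assumption mirroring the table).
    """
    N = sum(n for n, _, _, _ in subsets)
    M = sum(n * m for n, m, _, _ in subsets) / N
    if mean_decimals is not None:
        M = round(M, mean_decimals)
    var_sum = skew_sum = shift_sum = 0.0
    for n, m, s, a in subsets:
        d = m - M
        var_sum += n * (s ** 2 + d ** 2)            # sum n_i (sigma_i^2 + d_i^2)
        skew_sum += n * s ** 3 * a                  # sum n_i sigma_i^3 alpha_3:i
        shift_sum += n * d * (3 * s ** 2 + d ** 2)  # sum n_i d_i (3 sigma_i^2 + d_i^2)
    sigma = math.sqrt(var_sum / N)
    alpha3 = (skew_sum + shift_sum) / (N * sigma ** 3)
    return {"N": N, "M": M, "sigma": sigma, "alpha3": alpha3,
            "var_sum": var_sum, "skew_sum": skew_sum, "shift_sum": shift_sum}


result = combined_skewness(TABLE_I, mean_decimals=2)
for key, value in result.items():
    print(key, round(value, 4))
```

Run with `mean_decimals=2`, the sketch gives the same rounded values of $N$, $M$, $\sigma$, and $\alpha_3$ as the table. With the unrounded mean the skewness comes out slightly lower, since the $d_i$ then sum exactly to zero when weighted by $n_i$. The column sum $\sum n_i(\sigma_i^2 + d_i^2)$ computed this way is about 5449.55 rather than the printed 5457.1556, though $\sigma$ still rounds to 1.26.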





The example was chosen as an application of the formula given in 
equation (5) as it is almost a perfect illustration of the second point 
made. As a matter of fact, if subset 3 (the only one with a positive 
skewness) is omitted, the measure of skewness for the parent distribution 
formed by the remaining 12 subsets is found to be $\alpha_3 = +.20$, thus 
establishing the truth of statement (2), the second object of this paper. 
It is not the purpose here to discuss the conditions under which subsets 
may properly be combined, but the example does call attention to the 
kind of error which may easily be made in a general 
statement. 

Since $\alpha_3$ is positive or negative according as the numerator of the 
formula given by equation (5) is positive or negative, let the numerator 
be written in the form 

$$\sum n_i\sigma_i^3\alpha_{3:i} + \sum n_i d_i'\left(3\sigma_i^2 + {d_i'}^2\right) + \sum n_i d_i''\left(3\sigma_i^2 + {d_i''}^2\right),$$

where $d_i'$ corresponds to the difference in which $M_i > M$ and $d_i''$ corresponds 
to the difference in which $M_i < M$. The quantity $(3\sigma_i^2 + d_i^2)$ is 
obviously always positive. If then the $\alpha_{3:i}$ are all of the same sign, say 
negative, $\alpha_3$ for the combined distribution will be positive or negative 
according as $\sum n_i d_i'(3\sigma_i^2 + {d_i'}^2)$ is numerically greater than or less than 
$\sum n_i\sigma_i^3\alpha_{3:i} + \sum n_i d_i''(3\sigma_i^2 + {d_i''}^2)$. A similar statement is of course true in case the 
$\alpha_{3:i}$ are all positive. 
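
The three-part split of the numerator can be illustrated on the twelve negatively skewed subsets of the hemoglobin example (subset 3 omitted). This is a sketch under stated assumptions: the names `NEGATIVE_SUBSETS` and `numerator_parts` are hypothetical, the standard deviation of subset 12 is taken as 1.11 as implied by its row of the table, and the unrounded combined mean is used, so the figures need not match hand computation to the last decimal.

```python
import math

# (n_i, M_i, sigma_i, alpha_3:i) for subsets 1, 2 and 4-13 of Table I.
NEGATIVE_SUBSETS = [
    (168, 13.33, 1.92, -0.31),
    (272, 11.17, 1.31, -0.35),
    (332, 11.34, 1.00, -0.04),
    (308, 11.60, 1.08, -0.36),
    (312, 11.67, 1.05, -0.54),
    (291, 11.45, 1.00, -0.52),
    (273, 11.30, 1.03, -0.40),
    (246, 11.20, 1.19, -0.37),
    (263, 11.16, 1.07, -0.44),
    (253, 11.03, 1.17, -0.25),
    (210, 10.98, 1.11, -0.55),  # sigma inferred from the rest of the row
    (198, 10.98, 1.14, -0.59),
]


def numerator_parts(subsets):
    """Split the numerator of equation (5) into its three sums.

    Returns (skew, above, below, alpha3), where `above` collects the
    subsets with M_i > M and `below` those with M_i <= M.
    """
    N = sum(n for n, _, _, _ in subsets)
    M = sum(n * m for n, m, _, _ in subsets) / N
    skew = above = below = var_sum = 0.0
    for n, m, s, a in subsets:
        d = m - M
        term = n * d * (3 * s ** 2 + d ** 2)
        skew += n * s ** 3 * a
        if d > 0:
            above += term
        else:
            below += term
        var_sum += n * (s ** 2 + d ** 2)
    sigma = math.sqrt(var_sum / N)
    alpha3 = (skew + above + below) / (N * sigma ** 3)
    return skew, above, below, alpha3


skew, above, below, alpha3 = numerator_parts(NEGATIVE_SUBSETS)
print(skew, above, below, alpha3)
```

For these twelve subsets, every one of which is negatively skewed, the sum over subsets lying above the combined mean outweighs the other two sums together, so the combined skewness comes out positive, in agreement with the sign reported in the text.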

Three Aspects of Labor Dynamics, by W. S. Woytinsky. Washington: Committee 
on Social Security, Social Science Research Council. 1942. xiv, 249 
pp. $2.50. 


This is a statistical investigation into certain of the changes in the composition 
of the labor force, principally labor turnover in prosperity and 
depression, the duration of individual employment, and net accessions to 
the labor market under the influence of business depression. It is largely 
an expansion of three earlier summaries published for the Social Science 
Research Council and the Social Security Board. 

The material on labor turnover, constituting Part I of the volume, reviews 
the available statistics, footnotes the special studies, and details the history 
of this field of labor statistics since it was articulated during the period of 
the first world war. Turnover rates are analyzed in relation to business 
fluctuations, seasonal variations, types of industries, localities, skill, sex, 
age, working conditions, and length of service. An attempt is made to isolate 
the “unstable” element in the labor supply. The well-documented conclu- 
sions concerning economic behavior under these varying conditions and 
circumstances are fairly obvious. 

Part II assembles available material on the turnover of the unemployed 
according to duration of inactivity in an effort to determine the “hard core” 
of unemployment. This area of labor statistics has been studied more care- 
fully in recent depressions because of its pertinence to problems of relief, 
reemployment measures and social security. The author is, however, not 
concerned with measures of relieving unemployment. His discussion is 
limited to an explanation of the estimates themselves. 

Part III deals with the phenomenon of supplementary earners in the 
labor market during depressions and contains interesting material on family 
unemployment, including an analysis of the Philadelphia unemployment 
studies. The author presents evidence and interprets available data to 
prove that the apparent larger unemployment in multi-worker families is 
due to the increase in the number of persons seeking employment when the 
chief earner is unemployed. The data permit such an interpretation, but 
this does not appear to be the only explanation. “Humane” employment 
policy of management has endeavored, where it is not too costly, to retain 
workers who are the sole earners in the family. Among the alternatives 
available, this discriminatory lay-off system has been considered good pub- 
lic policy. The principle was formally adopted by the Government in the 
automobile settlement of 1934. 

It is a common phenomenon of the labor market that employment increases 
with improving business; at such times additional workers in families 
may also be induced into the labor market. This is not to deny that in com- 
munities of diversified industries the differential demand for labor may bring 
the results which Mr. Woytinsky anticipates. The essential fact is, however, 
that there is a labor reserve of persons not actively in the labor market and 
that these persons—boys, young women, married women, and older “re- 
tired” persons—become searchers for jobs only under exceptional conditions, 
such as ease of getting jobs and the inducement of high pay in prosperity or 
the desperate need for cash in depression. 

This sober treatment of Three Aspects of Labor Dynamics is based largely 
on studies in depressions. The “labor dynamics” of the past two years, in- 
volving a radical distortion of customary relations between workers and 
their jobs, are not considered. But, if “normalcy” returns again, the volume 
should be valuable as a summation of information for statisticians and 
economists on the perennial phenomenon of the relation of available jobs 
to the people who live by them. 

Gustav Peck 
Washington, D. C. 


British Unemployment Programs, 1920-1938, by Eveline M. Burns. Washington: 
Committee on Social Security, Social Science Research Council. 
1941. xx, 385 pp. $2.75. 


This authoritative study of Britain’s experience through the inter-war 
years of dealing with unemployment unquestionably will become a standard 
work for students of the subject in all countries. It is a good climax to a 
long series of excellent works on British experience with one of the knot- 
tiest and most universal problems of our times. Dr. Burns presents a 
careful and technical analysis of British experiments and achievements 
in the field. Furthermore, she evaluates the principles developed and basic 
problems involved in British experience with comprehensive knowledge 
and penetrating insight. The text is well documented and is accompanied 
by an admirable statistical appendix. 

The study presents a careful analysis of the three periods since 1920 of 
the British system for alleviating the distress of unemployment: the attempt, 
lasting from 1920 to 1931, to expand unemployment insurance to provide 
benefits for both short and long term unemployment; the attempt from 1931 
to 1934 to curtail unemployment insurance in the interest of financial solvency, 
with “transitional” benefits for the long term unemployed which yet 
were distinguished from local public assistance; and the program in effect since 
1935, by which a highly integrated but dual system of unemployment insurance 
and unemployment assistance is established, with financial responsibility 
for the latter provided from central tax funds. The author concludes 
that the last scheme is a notable achievement in providing maintenance 
for the unemployed which both reaches a rather high degree of coverage 
and coincides relatively well with prevailing political principles as to the 
relation between government and the individual citizen. 

The strengths of the present system (the word refers to times prior to the 
present war, obviously) lie in its relative stability, its fair success in meet- 
ing a basic and national minimum standard of life, and its administrative 
procedures. The achievements of the last are due in considerable part to the 
excellent program of local advisory councils and laymen’s participation as 
well as to Britain’s well-known civil service. The weaknesses of the system 
are found in part in the anomalous distinction between the short term un- 
employment insurance and long term unemployment assistance. Beyond 
any such distinction in principle, however, the author finds the most im- 
portant flaw in British programs for dealing with unemployment is the 
failure to provide to any considerable extent for prevention of unemploy- 
ment or adequate rehabilitation of the long term unemployed. 

Dr. Burns concludes that the British have progressed far in the mainte- 
nance of the unemployed. They have done it by developing categorization 
of the unemployed (useful in Britain) and centralization of financial re- 
sponsibility. The system has created a new relation between government and 
citizens with the responsibility of the former clearly accepted though the 
obligations of the latter and the social and economic functions of the modern 
family are still to be defined. The policy of prevention, the counselling and 
retraining of unemployed workers, the creation of work itself, whether by 
public works or by some measure of governmental control over industry 
and central planning has yet to be worked out in Britain as in other western 
countries. 

The reviewer would have liked rather more attention given to the func- 
tions of the employment service in relation to dealing with unemployment. 
The reviewer also questions even more than the author seems to do the 
acceptance by British organized labor groups of the present system of 
maintenance. These, however, are minor criticisms. Dr. Burns’ conclusion 
should be studied by all students of social, economic and political problems 
of modern industrial society. 

Mildred Fairchild 


Bryn Mawr College 


Dimensions of Society, A Quantitative Systematics for the Social Sciences, by 
Stuart Carter Dodd. New York: The Macmillan Company. 1942. ix, 955 
pp. $12.00. 


The sub-title of this book, A Quantitative Systematics for the Social Sciences, 
accurately describes its content. It consists of a new and ingenious system of 
classification under rubrics transformable into numerical expressions. There 
are six parts: part one deals with the sectors of society, indicia (of traits), 
population, space, time; and the following parts elaborate in detail each of 
these sectors. The author has done a prodigious amount of work in assem- 
bling and classifying an enormous and highly varied mass of material. His 
system is original and closely knit and, like all his work, shows rare capacity 
for painstaking analysis and organization. 

The S-theory of Dodd’s classification consists of a scheme of hypotheses 
which assert that every tabulation, graph, map, formula, prose paragraph 
(occasionally supplemented by the photograph of an individual, vide Gandhi, 
p. 86, or a composite of movie faces, vide p. 667) or other set of quantitative 
data in any of the social sciences may be grouped under its rubrics. The general 
sectors are four: $T$ = time, $L$ = space, $P$ = population, and $I$ = indicators 
of some trait. Thus $S = {}^{s}_{s}(T; I; L; P)^{s}_{s}$. 

By using pre-and-post-superscripts and subscripts (for each sector), the 
author accounts for exponents (self-multiplication), sub-classes (aggrega- 
tion), class-intervals, and cases (sometimes an individual item). Hundreds 
of illustrative specimens are considered and classified by this scheme. The 
heart of the S-theory (p. 41) is the quantic formula and the quantic number, 
since these provide (so the author claims) a thoroughgoing basis of classifica- 
tion for all quantifiable societal phenomena. Thus the quantic formula for 
the frequency distribution is, $S = T^0; I^1; L^0; P^1$, and the quantic number 
lifted off is, 0; 1; 0; 1. The quantic formula for an institution is, Ins. ? =J?; 
P?; T-*. Dodd’s system may be of interest to the pure mathematician, and 
will probably be of great interest to symbolic logicians whose studies concern 
the unity of science. Its value to ordinary statisticians and to empirical social 
scientists will depend upon the applications they can make of this new 
system of classification. The intellectual history of civilization is full of 
systems of classification. A new system is useful when it brings isolated and 
“unknown” things into relationship with some larger synthesis by discover- 
ing that these things, first thought to be unique, are in certain respects simi- 
lar to known things. Perhaps the utility of a new system can be tested by 
asking: Does it help explain causal relationships? (Dodd claims that his 
does.) Does it facilitate prediction? Does it facilitate control? For example, 
Dodd’s treatment of control is chiefly in terms of “social control.” He does 
not seem to exploit fully the possibility of classification and sub-classifica- 
tion as a device to promote control (methodological) by providing new or 
more precise calibrations for matching in experimental designs. Herein lies, 
perhaps, a fruitful application of his system in empirical social science. No 
general answer can be made to these questions. The reader can find the answer 
for himself by applying the system to his own specialty. 

F. Stuart Chapin 


University of Minnesota 


Social Research: A Study in Methods of Gathering Data, by George A. Lundberg. 
New York: Longmans, Green and Company. 1942. xx, 426 pp. 
$3.25. 

This is the first revision of a book which was published in 1929. It is almost 
completely rewritten, though it follows the general outline of the original 
edition in dealing with the philosophy or logic of science and concrete 
procedures. Two new chapters are concerned with questionnaires and sociometrics. 
Other chapters discuss the sample, schedule, measurement of 
attitudes, opinions and institutional behavior, field work, and social book- 
keeping. 

Like its predecessor, the revision is on the whole a simple and well- 
organized survey of problems and difficulties in social research. This comment 
refers particularly to the section from Chapter V onward, in which 
each chapter contains well chosen examples of techniques of investigation 
and is concluded with a concise summary and bibliography. As a text, the 
book should be as successful as the first edition. 

The author does not fail to point out both in his introductory four chap- 
ters and in critical comments in the summary of special techniques the 
pluralism of social science in its philosophies of research. If the book has 
any one distinctive lack, it is the failure to describe this conflict as suc- 
cintly and clearly as the particular deficiencies which may occur in sam- 
pling, schedule making, and scale construction. Most graduate students 
will have more difficulty in understanding this lack of coherence in science 
as such than in the interpretation and use of the instruments of research. 

Social statisticians will probably find Chapters VIII, IX, and X to be the 
most interesting. Therein the explorations of the sociologist in measurement 
and sociography are described. Similarly Chapter XI on field work should 
prove to be an excellent guide to a variety of professional students. Chapters 
VIII-X are complete and constitute as fine an analysis of the subjects dis- 
cussed as can be found in the literature. 

Specific shortcomings in the opinion of the reviewer, in addition to the 
politics of research, are the omission of a definite statement of what research 
and science are so that the layman may detect the numerous fads now cur- 
rent under the label of science, too few examples of research methodology 
from other social science fields than sociology, and failure to stress the point 
which is mentioned but not amply covered when “judgment” outweighs 
the virtue of all techniques. Moreover, in the use of studies as examples of 
research which appear to have unequal scientific merit, there is apt to be 
some confusion as to the precise limits of what is acceptable as science in 
sociology. 

Harold A. Phelps 

University of Pittsburgh 


Methods of Correlation Analysis, by Mordecai Ezekiel. (Second edition.) 
New York: John Wiley and Sons, Inc. 1941. xix, 531 pp. $5.00. 


This book contains the same leisurely, readable development of correla- 
tion methods which made the first edition an outstanding success. All of 
the desirable features have been retained and some important improvements 
in exposition have been made. Of the relatively few additions to subject 
matter the most important is the emphasis on logical limitations to flexibility 
of regression functions. Another interesting addition is the study of 
the reliability of an individual forecast in Chapter 19. 

Since it is unnecessary to enumerate the features of a book so well known 
to readers of this Journal, discussion of some changes the reviewer would 
have liked to see is in order. 

None of the recent methods for computing correlation constants based 
on determinants is seriously considered by the author. In the reviewer’s 
opinion this is an unfortunate omission. Presentation of Stephan’s treatment 
of the wheat problem (this Journal, March, 1931, p. 58) would serve to 
make a valuable but neglected device available to a wider group of readers. 

The use of degrees of freedom in the table of $t$ (Table A) would eliminate 
confusion in testing correlation and regression coefficients. A different symbol 
should be used for the number of variables when $n$ is used in the same 
equation for the number of observations in the sample (p. 211). In view of 
the great emphasis on “corrected” measures of correlation and “unbiased 
estimates” of standard errors of estimate a correct statement of the known 
properties of such estimates is badly needed. They are not median estimates 
(p. 143); neither are they unbiased in the usual sense of that word. Although 
discussion of part correlation (involving one independent variable) has been 
dropped from the text, it is defined in a footnote and derived in Appendix 2. 
Part correlation involving two independent variables is discussed in squared 
form (p. 218). The corresponding function appropriate for tests of significance, 
$(R_{1.234}^2 - r_{12}^2)/(1 - r_{12}^2)$, is simpler and has a “Beta” distribution 
with 2 and $n - 4$ degrees of freedom. 

The author refers to complicated techniques and tests of significance in 
a manner suggesting a degree of exactness and applicability which they do 
not possess (pp. 99, 320, 323, 367). The standard error of the mean which is 
exact under less restricted conditions than any other sampling formula in 
the book is “proved” in such a way as to suggest that it is approximate. On 
the other hand, the standard error of r is said to be “precise” (p. 318). The 
appropriate formula (in which $n - 1$ is substituted for $n - 2$) is exact for uncorrelated 
normal bivariate universes, but even the appropriate formula is 
an approximation for correlated universes. Standard errors for the index of 
correlation and the multiple correlation coefficient (given without qualifica- 
tion) are greatly in error and of questionable usefulness. The author even 
advocates the t-test for each of these inherently positive statistics and for 
“more exact interpretations” he suggests Fisher’s z-transformation! 

The foregoing criticisms of the material on sampling are not intended to 
detract from the merits of a great book. According to the preface the book 
is intended to cover “that portion of the field which is concerned with study- 
ing the relationships between variables.” Appraised on this basis the book 
has no equal nor even a close rival. 

John H. Smith 
Bureau of Labor Statistics 

Principles of Punch-Card Machine Operation, by Harry P. Hartkemeier. 
New York: Thomas Y. Crowell Company. 1942. xiv, 269 pp. $3.25. 


While there have been several studies which described specific applications 
of punched card equipment in accounting and statistical research, the re- 
viewer is not aware of the publication, prior to Dr. Hartkemeier’s mono- 
graph, of any text purporting to present an elementary treatment of the 
general problem of machine tabulation. Certainly the manuals which 
accompany the equipment are entirely inadequate for instructional purposes. 

Professor Hartkemeier’s text confines itself to instruction in the use of 
International Business Machines equipment and begins with a brief account 
of the historic development of the Hollerith invention. A statement of the 
sorting and tabulating principle is followed by a detailed description of the 
operation of the almost obsolete Type 3 Printer. Next, the standard IBM 
numeric tabulators (Models 285 and 297) are discussed in great detail. In- 
struction is given in wiring the tabulator plug board for listing, addition, 
subtraction, the use of balance counters, class and field selection, and the 
X-distributor. 

Out of the 153 pages devoted to the introduction and to numeric equip- 
ment, approximately 30 are actually textual; the rest consist of photostat 
copies of wiring diagrams and of the reports prepared on the tabulator from 
the various wiring diagrams displayed. Perhaps another 15 pages of text are 
devoted to the description, in part III, of the alphabetic accounting machine. 
This more modern apparatus, which the reviewer believes ought to be the 
almost exclusive subject of a text on the use of punched card equipment, is 
treated not as the only tabulator with which the modern student of punched 
card methods is likely to come in contact but rather as an incidental stage in 
the historic development of the numeric equipment. 

Most of the ordinary wiring problems are well discussed and there is a 
good section on the use of punched cards in obtaining sums of squares and 
cross products. 

While Dr. Hartkemeier has performed a real service in meeting a long felt 
need for a text on punched card machinery, several possibilities for improvement 
suggest themselves. The types of IBM equipment discussed constitute 
a small proportion of that company’s offerings in the punched card field. 
The summary punch, the reproducer, the gang punch, the collator, the 
multiplying punch, the interpreter, the various models of hand-operated 
numeric and alphabetic punches, and the mark-sensing reproducer receive 
little or no mention in this text. 

Since no discussion is given of competing equipment, there is no opportun- 
ity to point out the types of applications in which rival lines have a material 
advantage, such as the punching of cards for which a portion of the material 
is to be repeated from card to card or the listing of material requiring alpha- 
betic characters throughout the line of printing. 

It would have been helpful if more attention had been paid to possible 
fields of application of the techniques described. Except for the section on 
progressive digiting (for the accumulation of sums of squares and products) 
the illustrations are confined to elementary accounting problems. 

The reference system by which diagrams and forms are located is not well 
conceived. The reader may find it confusing to locate diagram 12 on page 
61, diagram 12-405 on page 189, and diagram 13 on page 62. The small 
figures which show the position of the pin in the clearing collar (on page 12) 
were omitted in earlier printings but this has been corrected in later runs. 
While the loose-leaf format simplifies the extraction of an individual wiring 
diagram for comparison with the relevant text and also makes it easier for 
the student to turn in the blank diagrams supplied with the text after com- 
pleting them, a number of copies have appeared with defective rings in the 
loose-leaf binder, which have made the use of the volume somewhat awk- 
ward. 

Francis McIntyre 

Office of Lend-Lease Administration 


The Fundamental Principles of Mathematical Statistics, by Hugh H. Wolfenden. 
New York: The Actuarial Society of America. 1942. xv, 379 pp. 


The full title of the book is The Fundamental Principles of Mathematical 
Statistics with special reference to the requirements of actuaries and vital sta- 
tisticians, and an outline of a course in graduation. The author is a fellow in 
three of the principal actuarial organizations of Great Britain and the United 
States and of the Royal Statistical Society. The subject is developed in 
terms of the needs of a particular field, in this case actuarial work and to a 
slight extent vital statistics, as contrasted, say, with business, agriculture, 
biology, psychology or engineering. The book could be described in the words 
which Yule applied to his own text, “definitely founded on experience, 
personal experience in statistical work and personal experience in teaching.” 

There are biographical and bibliographical references throughout the 
text and a section of 25 pages devoted exclusively to the history of the 
developments of mathematical statistics. The author explains in the preface 
that “in teaching these matters there is, inevitably, a cultural responsibility 
[beyond]...a recital merely of the present state of knowledge... . I 
believe it essential to the proper understanding of any subject to absorb the 
history of the mental processes which have guided its development.” The 
American reader will note with interest the highly appreciative references 
to the work by Erastus L. DeForest whose noteworthy and long neglected 
researches in graduation appeared in The Analyst of Des Moines, Iowa, sixty 
years ago. 

Some idea of the scope of the book as compared with typical American 
textbooks can be gained by a list of exclusions and inclusions. Not found are 
such topics as kurtosis, time series, index numbers, correlation. The fitting 
of regression lines and the analysis of variance receive one paragraph each, 
consisting chiefly of bibliographical references. The meaning of probability 
is relegated to the second section of the book since the actuarial students 
“come ... with a sound practical knowledge of the elements of the theory of 
probability.” The notation of mathematical expectation is excluded though 
not the term. 

On the other hand, frequency distributions and associated problems of 
sampling receive extensive treatment; brief but systematic presentation is 
given the systems associated with the names of Pearson, Gram-Charlier and 
Poisson-Charlier; and considerable space is given to graduation and fitting 
of curves for the quantities playing a role in the actuary’s office, $l_x$, $\log l_x$, 
$q_x$, and ten other similar functions. 

The mean of the Bernoulli distribution is found as $m = np$, but with the Poisson 
distribution, p. 60, the mean is given as $m = nq$. Is it not a “cultural responsibility” 
to let $p$ be the probability of death by a Bortkiewiczian horse kick? 
The actuarial student can then be told that he will often take the numerical 
value of $p$ from the $q_x$ column of the life table. 

The formula for the standard error of a linear compound, p. 25, could have 
been found more easily and with greater generality by merely assuming 
independence, not normality. The student will need further help at p. 39 
where variance seems to be referred to as a linear function of the variates, at 
p. 41 where clarity would come from the concept of all possible samples 
rather than “n-samples,” at p. 42 where the formulae for standard errors of 
various parameters need such annotations as: mean, “exact”; median, “only 
approximate even for normal distributions with n large”; standard deviation, 
“good approximation for normal distributions”; variance, “exact for normal 
provided the formula is changed from n to n—1”; etc. He will need help at 
p. 52 where the test of the sigmas is presented as “subject to the reservation 
that ,o, and »o, must not be unusual,” whereas in fact the restriction applies 
only to the sigma to be used in the denominator. 

These and other negative points are offset by insights and generalizations 
along the way which arouse the reader’s gratitude for a thoughtful treat- 
ment of the subject from an important point of view at the hands of one who 
takes teaching seriously and who speaks from wide knowledge and experience. 

Sidney W. Wilcox 
U. S. Bureau of Labor Statistics 


National Income and its Composition, 1919-1938, by Simon Kuznets. New 
York: National Bureau of Economic Research. 1941. Two vols., xxx, 980 
pp. $5.00. 

This double volume is a report on a major investigation of the country’s 
national income during the indicated years. In Part I, the design and ration- 
ale of national income estimation are examined and the author’s many elec- 
tions in choosing between alternatives reviewed. Next, the estimates are 
presented and analyzed at some length, in terms of totals and different types 


1 See, for example, J. H. Smith, Tests of Significance, University of Chicago Press, p. 57. 
of components. Part III discusses the derivation of the estimates and notes 
important characteristics resulting from the nature of concepts, methods, 
and source materials. This section also compares the estimates with other 
income computations and introduces the innovation of subjecting the figures 
to comprehensive tests of reliability. Final sections cover data, sources, and 
methods in detail. Where pertinent, chapter summaries supply a concise 
recapitulation of subject matter. 

This publication is important to “consumers” as well as “producers” of 
national income data because of the breadth of vision with which issues are 
discussed and because of the systematic purposefulness with which the 
author’s investigation of national income was conducted and is detailed to 
the reader. Naturally, with half its pages devoted to methodology and data 
it will not normally be studied from cover to cover. Yet the constructive 
analytic spirit displayed throughout weaves around this type of material an 
exposition that is remarkable for the insight it provides. Noteworthy is the 
careful choice made between a sterile counsel of perfection on one hand and 
ill-considered superficiality on the other—a stimulating example to persons 
interested in national income. 

In view of its scope, this work has remarkably few controversial features. 
Probably the most important controversy regarding design centers about the 
use of taxes in valuing governmental services, a choice between “two evils” 
(p. 32) which leads to the measurement of public services to businesses by the 
payments of these enterprises to government. Analysis of data tends to be 
mechanistic. Thus, the study of trends is unduly influenced by the use of 
averages for certain 5 and 10 year periods because insufficient recognition is 
accorded the cyclical settings of these averages. The data also have their 
practical limitations in that they end in 1938, fail to incorporate recently 
available information, are difficult to use because of the number of variants, 
and contain intrinsic deficiencies. The last is illustrated by Variant II of the 
implicit consumers’ outlay price index (p. 145) which moves questionably 
closely with wholesale prices after 1933. But these are relatively minor flaws in
this important treatise on a most timely topic.

DWIGHT B. YNTEMA


U. S. Department of Commerce


Exchange Control in Central Europe, by Howard S. Ellis. Cambridge: Har-
vard University Press. 1941. xiv, 413 pp. $4.00. 


Professor Ellis has written what is so far the most detailed history of the 
totalitarian type of control in international trade, leading through the jungle 
of administrative make-shifts and their economic implications in three 
countries: Austria, Hungary, and Germany. Extensive analyses in the light 
(or language) of current monetary theories complete this work of remarkable 
erudition. 

The greater one’s respect for the author’s painstaking and penetrating 
analysis, the more disappointing is the interpretative accomplishment. Its 
main thesis is that upholding the (nominal) parity with the aid of exchange
control was excusable in the crisis, but became harmful, and devaluation 
should have been substituted in 1936, if not in 1933. Of course, Dr. Ellis 
knows that devisen control was due to extraordinary circumstances, economic 
and political, but insists on condemning exchange regulations in comparison 
to open devaluation. He holds Austria as a shining example because she 
abandoned the regulations (except agreements with clearing countries). But 
Austria’s short term debts were virtually wiped out in the Credit-Anstalt 
arrangement, her long term foreign debts were effectively reduced, and fresh 
credit came through the League. Moreover, what the author forgets: Austria 
received fairly favorable commercial treatment and was encouraged in 
many ways while the others were to some extent boycotted.¹ Her balance of
payments had been brought into sufficient equilibrium to eliminate capital
flight, while the other two had to face the “music” throughout the period.

Much of the analysis is scientifically worthless, because it fails to take
well-known facts into account. Dr. Ellis imputes repeatedly the vanishing
of export surpluses to currency over-valuation, but admits in other places
that the currencies were de facto devalued. The question, whether bilateral-
ism wasn’t the way permitting Germany to maintain even a limited volume
of exports and to service partially her debt, is omitted altogether. What
about the rapidly increasing foreign resistance against Germany’s exports,
the exhaustion of her raw material inventories, the effects of a growing in-
he market due to “reflation” policies, etc.? How can one judge exchange
policies without allowing for such factors? Could devaluation protect a
debtor against capital flight and the run of foreign creditors? Would not
German devaluation have caused repercussions, which an unimportant
currency like the Austrian shilling did not provoke? Is it permissible to
impute all consequences to a single set of causes when a multitude of (un-
—— factors was at play?

Dr. Ellis believes in the highly controversial Casselian purchasing power
parities. Admitting that it is impossible to figure out equilibrium exchange
rates under altogether artificial conditions, he proceeds to apply the con-
struct for judging practical policies, using only visible imports and exports
at that. The Germans and Hungarians are blamed for not having acted upon
an equilibrium rate that existed only in theoretical imagination. The appar- 
ent equilibrium rates depended in reality upon expectations of governmental 
action. Much perspicacity is devoted to comparing German and British 
“terms of trade.” But this concept is meaningless when the economic alterna- 
tives are “to be or not to be,” to say nothing of political issues. Lastly, Dr. 
Ellis operates under a twofold misunderstanding: that exchange control 
necessarily leads to autarchy and totalitarianism; and that its abolition 
means the return to normal trade and political systems. In reality, exchange 
control was merely the most extreme among the super-protectionist systems
of the 30’s; the same objectives could be, and have been, accomplished in the
framework of the British and French types of exchange policies. The alleged
results of bilateralism—reducing the volume of trade, changing its direction
and composition, and extending monopolistic control—were attained else-
where by multilateral methods. The greatest weakness of the book is the lack
of a comparative picture, and the consequent bias against one system.²
MELCHIOR PALYI


Chicago

1 Neither the importance of tourist traffic for Austria’s balance of payments, nor the preference
given Austria by Western tourists is mentioned.


The Federal Reserve Bank of Cleveland, by Arthur F. Blaser, Jr. New York: 
Columbia University Press. 1942. xxvii, 306 pp. $3.50. 


The Federal Reserve Bank of Richmond, by Charles G. Coit. New York: 
Columbia University Press. 1941. xv, 140 pp. $2.00. 


The organization of these books and the questions with which they deal 
bespeak their common origin. They are the most recent of a series of studies 
initiated by the Banking Seminar at Columbia University to cover each of 
the twelve Federal Reserve Banks. Others have already been published for 
the New York, Chicago, San Francisco and Boston Reserve banks. 

Well organized, brief, and readable, Mr. Blaser’s and Mr. Coit’s studies 
describe the commercial, industrial, agricultural, and financial activities 
and institutions of the Cleveland and Richmond Federal Reserve districts. 
The authors’ main question is the extent to which the Cleveland and Rich- 
mond Reserve banks have operated as semi-autonomous central banks in 
their respective districts, as intended by the framers of the original Federal 
Reserve Act. They both conclude that the banks are rather administra-
tive branches than semi-autonomous units. To reach this conclusion hardly
requires a couple of books. It will go unchallenged. The same conclusion has
been reached independently by many students, and by still more persons who
are not students but who are interested in banking and aware of the obvious
tendencies of government in the better part of the past decade.

The one possible exception in which the authors concede some autonomy 
of local action is discounts, since these are made directly by the Reserve 
banks. But, as the authors point out, there have been no discounts worth 
mentioning for some years, and discount policy, including the rate, is subject 
to Reserve Board regulation. 

One achievement, however, can be credited to the framers of the Federal 
Reserve Act in the matter of local autonomy. Thanks to them, the Reserve 
banks have been staffed for the most part with men who have originally 
come from the districts or have become established residents there. This 
gives to bankers and local business men a measure of assurance that they are 
dealing with people who have some knowledge and sympathy for their own
conditions and problems. Thus, in the realm of administration, if not in the
realm of policy, a large measure of local autonomy has been achieved, which
the Federal Reserve Governors on more than one recent occasion have
pointed out as administratively desirable.
VICTOR M. LONGSTREET
Board of Governors of the
Federal Reserve System

2 Dr. Ellis’ bibliography indicates his bias by omitting such pertinent literature as J. Trier, H.
Luckas, F. Schaum, P. Raber, Pertot, etc. Serious students of the 30's, like H. J. Tasca, K. E. Poole,
H. Heuser, and C. W. Guillebaud, who emphasized the difficulties of debtor countries, are barely
noticed or actually ridiculed, while a dilettantic propagandist like Th. Balogh is approvingly quoted.


Federal Crop Insurance in Operation, by J. C. Clendenin. Stanford Uni- 
versity: Food Research Institute. Wheat Studies, Vol. XVIII, No. 6. 
March 1942, 229-290 pp. $1.25. 


This paper outlines extensively and discusses many phases of the Federal 
Wheat Crop Insurance Program. It reports the results of study of a wide 
variety of information such as regulations and procedures, basic individual 
farm data, questionnaire surveys sent to farmers and businessmen, conver- 
sations with farmers and with planning, administrative and operational 
employees. The insurance contract, actuarial details, participation, unit 
costs, public relations, administration and finances also are treated. Foot- 
notes give many details. There is an appendix of state and county data and 
a brief international history of crop insurance. The report is typical of those 
sponsored by the Food Research Institute in breadth, clearness and balance. 

The annual contract guarantees a yield equal to 75 per cent (or 50 per 
cent at the election of the applicant) of average for the farm. The Corpora- 
tion collects premiums and pays indemnities in either the wheat or the cash 
equivalent of a specified class and grade but does not protect against loss in 
quality. The per acre cost of insurance closely approximates the annual 
average bushels per acre loss which would occur on the farm in a representa- 
tive period of past years. The individual farm rating basis rather than a 
community or county basis is approved but it is suggested that even a 
smaller or individual field unit rating basis may become advisable. 

The guarantee of yield rather than value is commended as economically 
sound. The scope and interpretation of insurance coverage is called “quite 
satisfactory.” The risks covered include practically every hazard beyond the 
control of the insured and the risks not covered are those which are due to 
neglect, malfeasance, theft, fraud, or unsound farming practices. Loss adjust- 
ments are “reputed to be fair, sometimes verging on severe, and to reveal 
few cases of lavish settlements.” The author concludes that this is encourag-
ing, for laxity would quickly demoralize the program. A suggestion is made,
however, that there is some probability that adjustment standards may be 
responsible for part of the underwriting losses. 

Excess of indemnities paid over premiums collected in each of three years 
is attributed also to inaccurate and incomplete basic farm data and method 
of use, absence of any but casual check on the fields seeded except for acre- 
age, and unsatisfactory distribution of risks. 

The Federal Crop Insurance Corporation’s program is coordinated with 
that of the Agricultural Adjustment Agency. Mention is made of the low
cost of field administration made possible by the use of county AAA com-
mittees. The value of their knowledge of local conditions and of their sales
prestige is properly emphasized. However, the “caliber” and attitudes of the 
committees vary. Some are not insurance minded or are untrained as sales- 
men and, in part, are responsible for uneven distribution of risks and lax 
appraisals. 

The extent and volume of participation, the discussion of which covers about
one-fifth of the text, are called “phenomenal” in the first two years. “The third year
brought a sharp decline in the rate of growth . . . and yet policies covered 
over 17 per cent of the total seeded acreage and about the same percentage 
of the nation’s wheat growers.” The author seems convinced that the dis- 
tribution of business shifts chiefly with shifts in soil moisture reserves and 
recent loss experience but he proceeds inconsistently to stress the importance 
of farmer sales resistance to relatively high rates in the high-risk area. 

A public policy section of the paper expresses doubt that insurance en- 
courages uneconomic land use, discusses premium wheat storage problems, 
and considers the adaptability of the program to types of farming regions. 

The Federal Government bears the administrative cost of the program. 
Unit costs are thought to be high and in the main irreducible except by a 
material increase in business volume, but this opinion is based on several 
questionable assumptions. 

The author concludes finally that wheat crop insurance should be given 
a thorough trial by testing several alternative details before it becomes 
crystallized. A schedule rate, premium reduction for small loss experience, 
and a term policy are important alternatives suggested. Similar plans have 
been under consideration for some time by the Corporation and are now a 
part of the contract being currently offered. 

Such a well-rounded picture of the activities of the Corporation under 
one cover has been lacking in the published literature. There are some 
inaccuracies and questionable assumptions but they are not important inso-
far as a comprehensive presentation of the program is concerned. The paper
merits acceptance as a worthwhile contribution.

RICHARD O. CROMWELL


Federal Crop Insurance Corporation 


American Highway Policy, by Charles L. Dearing. Washington: The Brook- 
ings Institution. 1941. xi, 286 pp. $3.00. 


The public policy problems involved in building, maintaining, financing, 
and policing our highway system of more than three million miles are obvi- 
ously numerous and complex. These problems though interrelated may be 
conveniently grouped in three broad categories: (1) determining a satis- 
factory method of distributing authority and responsibility among the 
several levels of governments; (2) allocating the financial burden among 
individual taxpayers; and (3) resolving the long-standing controversies 
between highway transportation enterprises and competing modes of trans-
portation. The present volume was written in response to a request by the 
Commissioner of Public Roads of the United States Public Roads Adminis- 
tration addressed to the Brookings Institution to institute a study of the 
underlying principles according to which the abundant data already collected 
by the state-wide planning surveys might be employed in the solution of 
the controversial problems of highway financing and administration. 

Mr. Dearing observes that the provision of public roads has always been 
deemed to be an essential service of government. He states that “the modern 
road plant is a multiple-purpose facility,” serving “in one degree or another 
to give access to land and buildings; to facilitate the movement of goods and 
people primarily associated with community life; to supply the avenues of 
optimum intercommunity mobility; and ... to expedite the administration 
of various essential functions of government.” The author proposes to have 
the roads in each state classified along lines of use, with the so-called “general 
purpose” roads under the exclusive jurisdiction of the state agencies. He 
would assign the major financial burden of supporting the general purpose 
roads to motor vehicle users under “a system of levies designed to measure 
differential road occupancy as well as the cost of any additional physical 
facilities provided to meet the requirements of unusual size and weight 
characteristics.” The remaining roads, which he says are generally used for 
community service and land access purposes, would be administered and 
mostly financed by local units of government. 

Mr. Dearing believes that Federal participation in the road program 
should be confined to “those types of activities which are designed to serve 
broad national objectives.” In order to help limit the activities of the Federal 
Government in the highway field, he proposes to reserve to the states ex- 
clusive jurisdiction over the administration of the special motor vehicle user 
charges. He also expresses strong opposition to the so-called “public utility 
method” of managing the entire road plant as if it were a regulated public 
utility competing at all points with the railroads. 

In this limited space, there is opportunity only to suggest, by presenting 
a few examples of his views, that the author has identified and thoughtfully 
analyzed the salient issues in highway public policy. The book is well or- 
ganized and concisely written and will repay close study by all who are 
concerned with highway problems. 

RALPH L. DEWEY

Washington, D. C. 


The Marketing of Used Automobiles, by Theodore H. Smith. Columbus: 
Bureau of Business Research, Ohio State University. 1941. xv, 290 pp. 
$3.00. 


This book covers a larger field than is suggested by its title and is really a 
study of the marketing of both new and used cars from the beginning of 
automobile production through the year 1940. The reason for the inclusion 
of new car sales information is that the two subjects are virtually inseparable.
Used cars are, in the main, sold by dealers in new cars, while used car prices 
and sales volume are directly and powerfully affected by new car prices and 
sales volume. 

A great amount of factual information is presented, much of it in statistical 
form, including forty-five tables and eleven charts. The author’s comments 
and explanations are lucid and intelligent. The book should be a convenient 
source of reference for related information, such as the origin and growth 
of the sales finance companies, and various sidelights on the motor vehicle 
manufacturing business. For instance, we are told that the National Used 
Car Market Report for June 1915 gave price information on 154 makes of 
gasoline and steam cars and 14 makes of electric cars, these being “only the 
better known makes.” This is quite a contrast to conditions in 1941, when 
there were only 12 manufacturers of passenger automobiles in the United 
States. 

The book is not, as might be supposed, a compendium of advice to dealers 
on how to sell used cars, but is rather an historical treatise on how used cars 
have been sold, with discussion of the practices used, and their results. How- 
ever, it would doubtless be worthwhile for any erstwhile automobile dealers 
who expect again to be in that business, when it is restored to life after the 
war, to read the book. 

The automobile business differs from all others in the following respects: 
(1) The new product is sold to the public only through “enfranchised” 
dealers; (2) these dealers are required to pay in full for their stock in trade 
before delivery; (3) in the majority of sales the dealer has to accept used 
merchandise as part payment of the price of the goods sold; and (4) due to 
the high unit price, the majority of buyers cannot raise enough money to pay 
in full at time of purchase, but must buy on instalments, or not at all. 

These conditions produce, among others, the following results: (1) The 
manufacturers have the power, which they have always used, more or less, 
to “coerce” the dealers as to methods of conducting their business; for the 
manufacturer can cancel the dealer’s franchise, and put him out of business; 
(2) the average dealer cannot exist without wholesale and retail financing. 
He must borrow from a finance company most of the money needed to buy 
his new cars and he must sell to a finance company the instalment contracts 
resulting from his retail sales; (3) sales competition between dealers centers 
about the trade-in allowance rather than the retail sales price; (4) the dealer 
almost always loses money on the trade-in. He allows more for it than he can 
sell it for. 

It is because of these conditions that many unique problems arise in the 
retailing of automobiles and especially of used automobiles. Mr. Smith’s 
book discusses all of these problems in a highly intelligent manner. 

MILAN V. AYRES
Chicago


The Development of American Industries, planned and edited by John G.
Glover and William B. Cornell. New York: Prentice-Hall, Inc., 1941. 
xxviii, 1005 pp. $5.50 (trade) and $4.50 (school). 


This revised edition follows closely the first edition issued in 1932. In the 
main the exact content is retained except for modifications to incorporate 
more recent data and to cover new legislation affecting industry, new meth- 
ods of production, and new products. Some of the data, however, remain 
obsolete or are otherwise not fully satisfactory for measuring conditions in 
the industries or changes therein. Specifically, where reliance has been placed 
on data for scattered years, the measurement frequently reflects temporary 
conditions more largely than basic trends. This weakness could have been 
overcome by a more careful selection of comparable periods or by the intro- 
duction of charts to present a more complete picture in the available space. 

Although the authors state in the preface that “the effect of war conditions 
abroad and of our own national defense program has not been unduly empha- 
sized,” they would have been more nearly correct to state that they have 
given almost no consideration to that phase. This is true even though the 
metal, machine-tool, aircraft, power, chemical, and shipbuilding industries, 
covered in various chapters, were vitally affected long before the publication 
of this edition in September 1941. 

A survey of prewar industries, of course, may provide a useful background 
for the study of current problems. This work remains one of the best collec- 
tions of essays for the non-technical reader on industrial development. Al- 
together it contains 39 chapters on 38 industries, plus an initial chapter on 
labor’s contribution to American industries and a concluding chapter on 
trade associations. Most of the chapters have been prepared by executives 
in business concerns or trade associations. The contributions are of highly 
variable quality. Owing to the sparsity of references, the intellectually curi- 
ous reader will, no doubt, be left with a feeling that he has been given a hasty 
introduction without additional guidance. If he chooses to investigate, he 
will find that many of the statements are drawn from readily available 
sources. References would aid also in checking on information, which is 
not always reliably presented. Several summaries of legislation, for example, 
are misleading to the extent of confusing proposals with actual enactments. 
This deficiency of references and directives to outside sources is counter- 
balanced to some extent by an unusually good index to the material within
the volume.
WILBERT G. FRITZ


National Resources Planning Board 